GenCore version 5.1.6
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model

Run on:         September 11, 2004, 00:41:05 ; Search time 8878.33 Seconds
                (without alignments)
                12582.818 Million cell updates/sec

Title:          US-09-830-972-1
Perfect score:  3741
Sequence:       1 attgctcgtctgggcggcgg gattgaagcgcaaagcagat 3741

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       27513289 seqs, 14931090276 residues

Total number of hits satisfying chosen parameters:      55026578

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
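The throughput figure in the header can be cross-checked from the other header values. The sketch below assumes that one cell update is one query-residue by database-residue comparison and that both strands were searched (the factor `STRANDS = 2` is an assumption, not something the report states; it is suggested by the hit count 55026578 being exactly twice the 27513289 sequences searched). Under those assumptions the result agrees with the reported 12582.818 to within rounding.

```python
# Values copied from the report header.
QUERY_LENGTH = 3741            # query length (equals the perfect score here)
DB_RESIDUES = 14_931_090_276   # residues searched
SEARCH_SECONDS = 8878.33       # search time, without alignments
STRANDS = 2                    # ASSUMPTION: both strands were searched


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=STRANDS):
    """Dynamic-programming cells filled per second, in millions."""
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate:.2f} Million cell updates/sec")
```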



Database :      EST:*
                1:  em_estba:*
                2:  em_esthum:*
                3:  em_estin:*
                4:  em_estmu:*
                5:  em_estov:*
                6:  em_estpl:*
                7:  em_estro:*
                8:  em_htc:*
                9:  gb_est1:*
                10:  gb_est2:*
                11:  gb_htc:*
                12:  gb_est3:*
                13:  gb_est4:*
                14:  gb_est5:*
                15:  em_estfun:*
                16:  em_estom:*
                17:  em_gss_hum:*
                18:  em_gss_inv:*
                19:  em_gss_pln:*
                20:  em_gss_vrt:*
                21:  em_gss_fun:*
                22:  em_gss_mam:*
                23:  em_gss_mus:*
                24:  em_gss_pro:*
                25:  em_gss_rod:*
                26:  em_gss_phg:*
                27:  em_gss_vrl:*
                28:  gb_gss1:*
                29:  gb_gss2:*



Pred. No. is the number of results predicted by chance to have a
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution.

RESULT 1
BU839934
LOCUS       BU839934 969 bp mRNA linear EST 16-OCT-2002
DEFINITION  AGENCOURT_8947611 NIH_MGC_130 Mus musculus cDNA clone IMAGE:6329890
            5', mRNA sequence.
ACCESSION   BU839934
VERSION     BU839934.1 GI:24024317
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 969)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Mark Maconochie, Ph.D. and Nancy L. Freeman,
            Ph.D.
            cDNA Library Preparation: ResGen, Invitrogen Corp
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM13783 row: g column: 11
            High quality sequence stop: 651.
FEATURES             Location/Qualifiers
     source          1..969
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6329890"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_130"
                     /note="Organ: otocysts; Vector: pCMV-SPORT6.1; Site_1:
                     EcoRV; Site_2: NotI; Cloned unidirectionally. Primer:
                     Oligo dT. Average insert size 1.95 kb. Constructed by
                     ResGen, Invitrogen Corp. Note: this is a NIH MGC Library."
ORIGIN



Query Match 20.6%; Score 772.4; DB 13; Length 969; 

Best Local Similarity 89.6%; Pred. No. 1.3e-103; 

Matches 878; Conservative 0; Mismatches 88; Indels 14; Gaps 4; 
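The percentages in each result's summary block appear to follow from the counts printed beside them. The sketch below is a consistency check rather than a documented GenCore definition: it assumes Query Match is the score divided by the perfect score (3741, from the header) and Best Local Similarity is matches divided by aligned columns (matches + mismatches + indels). Both relations are inferred from the numbers in this report; the function names are illustrative.

```python
import re

PERFECT_SCORE = 3741  # "Perfect score" from the report header

# One "Name value;" field, e.g. "Score 772.4;" or "Pred. No. 1.3e-103;".
FIELD = re.compile(r"([A-Za-z][A-Za-z. ]*?)\s+(\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)%?;")


def parse_stats(block):
    """Map each field name in a summary block to its numeric value."""
    return {name.strip(): float(value) for name, value in FIELD.findall(block)}


def query_match_pct(score, perfect=PERFECT_SCORE):
    # ASSUMPTION: Query Match = score / perfect score.
    return 100.0 * score / perfect


def local_similarity_pct(matches, mismatches, indels):
    # ASSUMPTION: similarity = matches / aligned columns.
    return 100.0 * matches / (matches + mismatches + indels)


RESULT_1 = """\
Query Match 20.6%; Score 772.4; DB 13; Length 969;
Best Local Similarity 89.6%; Pred. No. 1.3e-103;
Matches 878; Conservative 0; Mismatches 88; Indels 14; Gaps 4;
"""

stats = parse_stats(RESULT_1)
qm = round(query_match_pct(stats["Score"]), 1)
sim = round(
    local_similarity_pct(stats["Matches"], stats["Mismatches"], stats["Indels"]), 1
)
print(qm, stats["Query Match"])
print(sim, stats["Best Local Similarity"])
```

The same two relations also reproduce the percentages printed for RESULT 2 through RESULT 5.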

Qy 2172 CAT GAAT GT AGC ACT AAAAG CT T T GGGAACAAAG GAAG GAATAAAAGAG C CT GAAAGT T T 22 31 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I II III I I I I I I I I I I I I I I I I I 
Db 1 CAT GAGT GT AGC ACT AAAAACAT C GGACT CAAAG GAAGAAAT T AAAGAGC C T GAAAGT T T 60 

Qy 2232 T AAT G CAGCT GT T C AG GAAACAGAAGCT C CT TAT AT AT C CAT T G C GT GT GAT T T AAT T AA 2291 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I 



Db 61 T AAT G CAGCT GC T CAGGAAG C AGAAG CT C CT TAT AT AT C CATT GC AT GT GAT T T AAT T AA 12 0 

Qy 2292 AGAAAC AAAGCT CT C CACT GAGC CAAGT C C AGAT T T CT C T AAT TAT T C AGAAAT AGC AAA 2351 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 121 AGAAACAAAG CT CT C CACT GAGC CAAGT C C AGAGT T C T CT AAT TAT T C AGAAAT AGC AAA 180 

Qy 2352 AT T C GAGAAGT C G GT GC C C GAAC AC G CT GAG CT AGT GGAGGAT T C CT C AC CT GAAT CT GA 2411 

III I I I I I I I I I I M I I II III I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 181 AT T T GAGAAGT CGGTGCCT GAT CACT GT GAGC T C GT G GAT GAT T C CT C AC C C GAAT CT GA 240 

Qy 2412 AC C AGT T GAC TT AT T T AGT GAT GAT T C GAT T C CT GAAGT C C C ACAAAC AC AAGAGGAG GC 2471 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I II I I I 
Db 241 AC CAGTT GACTT ATTTAGT GAT GATT CAATT CCT GAAGT CCCACAAACACAAGAGGAGGC 300 

Qy 2472 T GT GAT GCT CAT GAAGGAGAGT CT CACT GAAGT GT CT GAGACAGT AGC C C AGCACAAA — 2529 

I II I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I II I I I II M I II I II I I I 

Db 301 T GT GAT G CT AAT GAAGGAGAGT C T CACT GAAGT GT C T GAGACAGT AAC AC AACACAAACA 360 

Qy 2530 - GAG GAGAGACT T AGT G C CT C AC CT C AGGAGCT AGGAAAGC CAT AT T T AGAGT CT T TT CA 2588 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 TAAGGAGAGACTTAGT GCTT CAC CT CAGGAGGTAGGAAAGCCATAT TT AGAGT CTTTT CA 420 

Qy 2589 GCC CAATTTACATAGTACAAAAGAT GCT GCAT CTAAT GACATT CCAACATTGAC CAAAAA 264 8 

II II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 421 GCCCAAT TTACATATTACAAAAGAT GCT GCAT CTAAT GAAATT CCAACAT T GAC CAAAAA 480 

Qy 2649 G GAGAAAAT T T CTT T GCAAAT GGAAGAGT T T AAT AC T GCAATT T AT T CAAAT GAT GACTT 2708 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 4 81 G GAGACAAT T T CTT T GCAAAT GGAAGAGT T T AAT AC T GCAATT TAT T C CAAT GAT GACT T 54 0 

Qy 2709 ACT T T C T T CTAAGGAAGACAAAAT AAAAGAAAGT GAAAC AT T T T C AGAT T CAT CT C C GAT 2768 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II 
Db 541 ACT T T C T T CTAAGGAAGACAAAAT GAAAGAAAGT GAAAC AT T T T C C GAT T CAT C T C C CAT 600 

Qy 2769 T GAGAT AAT AGAT GAATT T C C CAC GT T T GT C AGT GCT AAAGAT GATT C T C CT AAAT TAG C 2828 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 601 T GAGAT AAT AGAT GAGTTT CCCACATTT GT CAGT GCTANAGAT GATT CT C CT 652 

Qy 2829 CAAGG AGT ACACT GAT CTAGAAGT AT CCGACAAAAGT GAAATT GCT AAT AT CCAAAGCGG 2888 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 653 - AAGGAGT ACAC T GAC CTAGAAGT AT C CAACAAAAGT GAAAT T GCT AAT GT C C AGAGC GG 711 

Qy 2 8 89 GGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTCAAGAATATATATCC 2 94 8 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 712 NGCCAATTCGTTGCCTTGCTCAGAATTGCCCTGTGACCTTTCTTTCAAGAATACATATCC 771 

Qy 2949 T AAAGAT GAAGT ACATGTTTCAGAT GAATT CTCCGAAAATAGGTCCAGTGT AT CTAAGGC 3008 

I II I I II I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I 
Db 772 T AAAGAT GAAGC AC AT GT CT C AGAT GAAT T C T - CAAAAGT AGGT C CAGT GT AT CT AAGGT 830 

Qy 3009 AT CCATATCGCCTT CAAAT GTCTCTGCTTTGGAACCTCAGACAGAAATGGGCAGCAT AGT 3068 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 831 GCCCCTATTGCTTCCCAATGGTTTCTGCTTGGAATCTCAAATAG-AATGGGCCACATAGT 889 

Qy 3069 T AAAT C CAAAT CAC TT AC GAAAGAAGC AGAGAAAAAACT T C CT T CT GAC ACAGAGAAAGA 3128 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 890 T T AAC C CAAAGT ACTT AC GGAAGAAG C AGAG GAAAAACT T C CT T CTT GAT C C GAGAAAGA 94 9 



Qy 3129 G GACAGAT C C C T GT C AGCT G 3148 

II I I II I I II I I I I I I 
Db 950 GGGACGATCCCTGACAGCTG 969 



RESULT 2 
CA511870 

LOCUS       CA511870 785 bp mRNA linear EST 15-NOV-2002
DEFINITION  UI-R-FJ0-cpx-e-15-0-UI.r1 UI-R-FJ0 Rattus norvegicus cDNA clone
            UI-R-FJ0-cpx-e-15-0-UI 5', mRNA sequence.
ACCESSION   CA511870
VERSION     CA511870.1 GI:25002824
KEYWORDS    EST.
SOURCE      Rattus norvegicus (Norway rat)
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1 (bases 1 to 785)
  AUTHORS   Bonaldo,M.F., Lennon,G. and Soares,M.B.
  TITLE     Normalization and subtraction: two approaches to facilitate gene
            discovery
  JOURNAL   Genome Res. 6 (9), 791-806 (1996)
  MEDLINE   97044477
   PUBMED   8889548
COMMENT     Contact: Soares, MB
            Coordinated Laboratory for Computational Genomics
            University of Iowa
            375 Newton Road, 4156 MEBRF, Iowa City, IA 52242, USA
            Tel: 319 335 8250
            Fax: 319 335 9565
            Email: bento-soares@uiowa.edu
            Tissue Procurement: Dr. James Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Researchers may obtain clones from Research
            Genetics (www.resgen.com).
            Seq primer: M13 REVERSE.
FEATURES             Location/Qualifiers
     source          1..785
                     /organism="Rattus norvegicus"
                     /mol_type="mRNA"
                     /strain="Sprague-Dawley"
                     /db_xref="taxon:10116"
                     /clone="UI-R-FJ0-cpx-e-15-0-UI"
                     /tissue_type="embryo"
                     /dev_stage="embryo"
                     /lab_host="DH10B (Life Technologies) (T1 phage resistant)"
                     /clone_lib="UI-R-FJ0"
                     /note="Vector: pYX-Asc; Site_1: EcoR I; Site_2: Not I;
                     UI-R-FJ0 is a cDNA library containing the following
                     tissue(s): rat embryo. The library was constructed
                     according to Bonaldo, Lennon and Soares, Genome Research,
                     6:791-806, 1996. First strand cDNA synthesis was primed
                     with an oligo-dT primer containing a Not I site. Double
                     stranded cDNA was ligated to an EcoR I adaptor, digested
                     with Not I, and cloned directionally into pT7T3-Pac
                     vector. The oligonucleotide used to prime the synthesis of
                     first-strand cDNA contains a library tag sequence that is
                     located between the Not I site and the (dT)18 tail. The
                     sequence tag for this library is CATCTCTACT. This library
                     was created for the University of Iowa Program for Rat
                     Gene Discovery and Mapping (Val Sheffield, Bento Soares
                     and Tom Casavant)"
ORIGIN

Query Match 20.5%; Score 767.2; DB 14; Length 785; 

Best Local Similarity 99.2%; Pred. No. 7.7e-103; 

Matches 780; Conservative 0; Mismatches 5; Indels 1; Gaps 1; 

Qy 1699 AT AAC AGAGAAGAC TAG C C C C AAAAC GT CAAAT CCTTTCCTT GT AGCAGT ACAGGATT CT 1758 

I M I I I I I I I I I I I I I Mill I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I 
Db 1 AT AAC AGAGAAGACT AT C C C C AC AAC GT CAAAT CCTTTCCTT GT AGCAGT AC AG GAT T CT 60 

Qy 1759 GAGGC AGAT TAT GT T ACAAC AGAT AC CT T AT CAAAGGT GAC T GAG G CAGC AGT GT CAAAC 1818 

M I I I I I I I I II I II I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I II I I I I I I I I I 
Db 61 GAGGC AGAT TAT GT T ACAACAGAT AC CT T AT CAAAGGT GACT GAGGCAGC AGT GT CAAAC 120 

Qy 1819 AT GC C T GAAG GT CT GAC G C C AGAT T T AGT T C AG GAAGC AT GT GAAAGT GAACT GAAT GAA 187 8 

I I M I I I I I II I I II I I I I I I I I I I I I I I I I I I I || I I I I I | I I | | | | | | | | | | | M | | | 
Db 121 ATGCCT GAAGGTCT GACGCCAGATTTAGTTCAGGAAGCATGTGAAAGT GAACTGAATGAA 180 

Qy 1879 GC CACAGGT ACAAAGAT T GC T TAT GAAACAAAAGT G GACTT GGT C C AAACAT C AGAAG CT 1938 

I I I M I I II I I I I I I I I II I I I I I I II I I I I I I I I I I I 1 I I I I I I I I I I I || I I I | | | | | 
Db 181 GC CACAGGT ACAAAGAT T GCT T AT GAAACAAAAGT GGACTT GGT C CAAAC AT C AGAAGC T 240 

Qy 1939 AT ACAAGAAT C ACT T T AC C C C AC AG CAC AG CT T T GC C CAT C AT TT GAGGAAGCT GAAG C A 1998 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I 
Db 241 AT ACAAGAAT CACTTT AC C C CACAGCACAGCTTT GCC CATCATTT GAGGAAGCT GAAGCA 300 

Qy 1999 ACTCCGTCACCAGTTTTGCCTGATATTGTTATGGAAGCACCATTAAATTCTCTCCTTCCA 2058 

I I I I I M I I I I I I I I I II I I I I II I I I II I I I I II I I II I I I I I I I I I I I I I I I I I I I I I 
Db 301 ACTCCGTCACCAGTTTTGCCTGATATTGTTATGGAAGCACCATTAAATTCTCTCCTTCCA 360 

Qy 2059 AGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTGGAAGCACCTCCTCCAGTT 2118 

I M I I I I I I II II M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 361 AGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTGGAAGCACCTCCTCCAGTT 42 0 

Qy 2119 AGTTATGACAGTATAT^AGCTTGAGCCTGAAAACCCCCCACCATATGAAGAAGCCATGAAT 217 8 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I II I I I I I I I I I M 
Db 421 AGT TAT GACAGT AT AAAG CT T G AGC C T GAAAAT C C C C CAC CAT AT GAAGAAGC CAT GAAT 480 

Qy 2179 GTAGCACTAAAAGCTTTGGGAACAAAGGAAGGAATAAAAGAGCCTGAAAGTTTTAATGCA 2238 

I I I I i I I M I II I I I I I I I I I II I I I I I I I I I II I I I I I I I M I I I M I I I I | | | II I I I 
Db 481 GTAGCACTAAAAGCTTTGGGAACAAAGGAAGGAATAAAAGAGCCTGAAAGTTTTT^ATGCA 54 0 

Qy 2239 GC T GT T C AGGAAAC AGAAGCT C CT TAT AT AT C CAT T GC GT GT GAT TTAAT T AAAGAAAC A 2298 

I I I I I M I I I I I I I I I I I I I I I I I I | | I I I I | | | | M | | | | | | | | | | | | | | || | | | | | | | 
Db 541 G C T GT T C AG GAAACAGAAGC T C CT T AT AT AT C CAT T G C GT GT GAT T T AAT TAAAGAAACA 60 0 

Qy 2299 AAGCTCTCCACTGAGCCAAGTCCAGATTTCTCTAATTATTCAGAAATAGCAAAATTCGAG 2358
        ||||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||
Db  601 AAGCTCTCCACTGAGCCAAGTCCAGATTTCTCTAATTATTCAGAAATAGCANAATTCGAG 660

Qy 2359 AAGTCGGTGCCCGAACACGCTGAGCTAGTGGAGGATTCCTCACCTGAATCTGAACCAGTT 2418
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 AAGTCGGTGCCCGAACACGCTGAGCTAGTGGAGGATTCCTCACCTGAATCTGAACCAGTT 720

Qy 2419 GACTTATTTAGTGATGATTCGATTCCTGAAGTCCCACAAACACAAGAGGAGGCTGTGATG 2478
        |||||||||||||||||||||||||||||||| ||||| |||||||||||||||||||||
Db  721 GACTTATTTAGTGATGATTCGATTCCTGAAGT-CCACANACACAAGAGGAGGCTGTGATG 779

Qy 2479 CTCATG 2484 

MINI 

Db 780 CTCATG 785 



RESULT 3 
BU709149 
LOCUS       BU709149 842 bp mRNA linear EST 15-JUL-2003
DEFINITION  UI-M-EW0-caz-o-10-0-UI.r1 NIH_BMAP_EW0 Mus musculus cDNA clone
            IMAGE:6419553 5', mRNA sequence.
ACCESSION   BU709149
VERSION     BU709149.1 GI:23642332
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 842)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. James Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..842
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6419553"
                     /tissue_type="whole brain"
                     /dev_stage="embryo 15.5 dpc"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_EW0"
                     /note="Organ: brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according to
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured mRNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with an
                     oligo-dT primer containing a Not I site. Double stranded
                     cDNA was size selected according to mRNA size fraction,
                     ligated with EcoR I adaptor, digested with Not I, and then
                     cloned directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA
                     tail, is GTGCGTGGAA. This library was created for the
                     University of Iowa Mouse Brain Molecular Anatomy Project
                     (BMAP): 'Gene Discovery in the Developing Mouse Nervous
                     System', supported by National Institutes of Mental Health
                     (NIMH), Hemin Chin, Ph.D., program coordinator."
ORIGIN

Query Match 20.1%; Score 753.4; DB 13; Length 842; 

Best Local Similarity 94.2%; Pred. No. 8.2e-101; 

Matches 792; Conservative 0; Mismatches 48; Indels 1; Gaps 1; 

Qy 1677 AGAAGAAAGGAAGGC C CAAAT T AT AACAGAGAAGACT AG C C C CAAAAC GT CAAAT C C T TT 1736 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I II I I I 

Db 2 AGAAGAAAG GAAG GC C CAAAT TAT AACAGAGAAGACT AGC C C CAAAAC GT CAAAT C CTTT 61 

Qy 1737 C C T T GT AGC AGT ACAG GAT T CT GAG G C AGAT TAT GTT ACAACAGAT AC CT T AT CAAAGGT 17 96 

I i I ■ I I I . I I II I I I I I I I I II I I I I II I I II I I I I I I II I I I I I I I I I I I I I II 
Db 62 C CT T GT AGCAAT AC AT GAT T CT GAGG C AGAT TAT GT C ACAACAGAT AAT T TAT CAAAGGT 121 

Qy 17 97 GAC T GAGGCAG C AGT GT CAAAC AT GC CT GAAGGT CT GAC G C C AG AT T T AGT T C AGGAAG C 1856 

M I I I I I I I I I I I II III I II II I I I I I I I I II I II I I I I I I I | | | | | | | | | M I I 
Db 122 GAC T GAGG C AGT AGT G G CAAC CAT G C CT GAAGGT CT AAC GC C AGAT T T AGT T C AG GAAG C 181 

Qy 1857 AT GT GAAAGT GAAC T GAAT GAAG C C AC AG GT ACAAAG AT T GCT T AT GAAACAAAAGT GGA 1916 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I | | | | | | | | | | | | | | | | | M I I I I 
Db 182 AT GT GAAAGT GAACT GAAC GAAGC C ACAG GT ACAAAGAT T G C T TAT GAAACAAAAGT G GA 241 

Qy 1917 C T T GGT C CAAAC AT C AGAAG CT AT ACAAGAAT C AC T T T ACC C CACAGC AC AG CTTTGCCC 197 6 

I M I I I I I I M I I I I I I I I I II I I III I M I I II I I I I I I II I I I I I I I I I I 

Db 242 CT T GGT C CAGAC AT CAGAAGCT AT ACAAGAGT CAAT T T ACC CCACAG C AC AGCT T T GC C C 301 

Qy 1977 AT CAT T T GAG GAAGCT GAAG CAACT C C GT CAC C AGT T T T GC CT GAT AT T GT TAT GGAAGC 2036 

I I I I I I I I I I I M I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I || I I I I I I I 
Db 302 AT CAT T T GAGGAAG C T GAAG CAAC T C C GT CAC C AGT T T T GC CT GAT AT T GT TAT G GAAG C 361 

Qy 2037 ACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCC 2096 

I I I I I I I I I I I I I M I II I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 362 GCCATTAAATTCTCTCCTTCCAAGCACTGGTGCTTCTGTAGCGCAGCCCAGTGCATCCCC 421 

Qy 2097 AC T G GAAG C AC CT C CT CCAGTT AGT T AT GAC AGT ATAAAGCT T G AGC CT GAAAAC C C C C C 2156 

IN I I I I Ml I I I I I I II I I I I I I I I I I I I I I I I I I I I I || I I I I I I || | I I I I 
Db 422 ACT AGAAGT AC C GT CT CCAGT T AGT T AT G AC GGT AT AAAG C T T GAGC C T GAAAAT C C C C C 481 

Qy 2157 AC CAT AT GAAGAAGCCAT GAAT GTAGCACTAAAAGCTTT GGGAACAAAGGAAGGAATAAA 2216 

I M I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I II I I I I I I I I I Ml II 
Db 482 ACCAT AT GAAGAAGCCAT GAGTGTAGCACTAAAAACATCGGACTCAAAGGAAGAAATTAA 541 

Qy 2217 AGAGC CT GAAAGT T T T AAT GCAG C T GT T C AG GAAAC AGAAGCT C C TT AT AT AT C CAT T GC 2276 

M I I I I I II I II I I I II I I I I I I II I ',11. I I I I I I I I I I I | M I I I I I I I I I I I 
Db 542 AGAG C C T GAAAGT T T TAAT G CAG CT GCT C AGGAAG CAGAAG C T C C T TAT AT AT C CAT T G C 601 



Qy 2277 GT GT GAT T TAAT TAAAGAAACAAAG CT CT C CAC T GAG C CAAGT C C AGAT T T C T C TAAT T A 2336 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 602 AT GT GAT T TAAT T AAAGAAAC AAAGCT CT C CACT GAGC CAAGT C C AGAGT T CT CTAAT T A 661 

Qy 2337 TT CAGAAAT AG CAAAAT T C GAGAAGT C GGT G C C C GAAC AC GCT GAG CT AGT G GAGGAT T C 2396 

I I I I I M I I I I I I I I I I I I I I I M I I I I I I I II III I I I I II Mill I I I I I 
Db 662 T T CAGAAAT AGCAN AAT T T GAGAAGT CGGTGCCT GAT CAC T GT GAGCT C GT GGAT GAT T C 721 

Qy 2397 C T CAC CT GAAT C T GAAC C AGT T GAC T TAT T T AGT GAT GAT T C GAT T C C T GAAGT C C C ACA 2456 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I 
Db 722 CT CAC C C GAAT CT GAAC CAGT T GACT T AT T T AGT GAT GAT T CAAT T C C T GAAGT - C C ACA 78 0 

Qy 2 457 AACACAAGAGGAG GCT GT GAT GCT CAT GAAG GAGAGT CT CACT GAAGT GT CT GAGAC AGT 2516 

II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I M I I I I I I I I I I I II I I I I I I I I II 

Db 781 NACACAAGAG GAGGC T GT GAT GCT AAT GAAGGAGAGT C T CAC T GAAGT GT C T GAGAC AGT 84 0 

Qy 2517 A 2517 

I 

Db 841 A 841 



RESULT 4 
CB204418 
LOCUS       CB204418 896 bp mRNA linear EST 05-FEB-2003
DEFINITION  AGENCOURT_11276017 NIH_MGC_135 Mus musculus cDNA clone
            IMAGE:30138586 5', mRNA sequence.
ACCESSION   CB204418
VERSION     CB204418.1 GI:28241848
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 896)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. David Rowe
            cDNA Library Preparation: Invitrogen Corp
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: NDAM0041 row: k column: 11
            High quality sequence stop: 686.
FEATURES             Location/Qualifiers
     source          1..896
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:30138586"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_135"
                     /note="Vector: pCMVSport6.1; Site_1: EcoRV; Site_2: NotI;
                     Normalized full-length enriched library from pooled mouse
                     embryonic limb, maxilla and mandible, day 12.5, 13.5,
                     14.5, and 15.5 (size selected for the 0.5-1 kb fragments)
                     Cloned directionally, priming method: Oligo-dT. cDNA
                     enrichment: >1k bp, Average insert size 1.6k bp.
                     Normalization (Cot value): 7.5 kb. Priming sequence:
                     5' GACTAGTTCTAGATCGCGAGCGGCCGCCC (T) 3' Tissue contributed
                     by, David Rowe. Library constructed by ResGen, Invitrogen
                     Corp."
ORIGIN



Query Match 19.9%; Score 745; DB 14; Length 896; 

Best Local Similarity 93.0%; Pred. No. 1.4e-99; 

Matches 816; Conservative 0; Mismatches 50; Indels 11; Gaps 3; 

Qy 2745 AACATTTT CAGATT CAT CT CC GATT GAGATAATAGAT GAATT T C C CACGT TT GT CAGT GC 2804 

I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I II I I I II I II I I I I 
Db 13 AACAT T T T C C GATT CAT CT C C CAT T GAGATAATAGAT GAGT T T C C C AC AT T T GT C AGT GC 72 

Qy 2 8 05 T AAAGAT GAT TCT C CTAAAT T AGC CAAG GAGT AC ACT GAT CT AGAAGT AT C C GAC AAAAG 2864 

I I I I I I I M II I I I I I I I I I M I I I I I I I I M I I I I I I I I I I II II I I I 

Db 73 T AAAGAT GAT T CT C CT AAGGAGT ACACT GACCT AGAAGT AT C CAACAAAAG 123 

Qy 2865 T GAAAT T GC T AAT AT C CAAAGC G GGGC AGAT T CAT T GC C TT GCT T AGAAT T GC C C T GT GA 2 924 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 124 TGAAATTGCTAATGTCCAGAGCGGGGCCAATTCGTTGCCTTGCTCAGAATTGCCCTGTGA 183 

Qy 2 925 CCTTTCTTT CAAGAAT AT AT AT C CT AAAGAT GAAGT ACAT GTT T C AGAT GAAT T CT C C GA 2 984 

I I I I I I I II I I I I I I 1 I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 184 CCTTTCTTT CAAGAAT AC AT AT C CT AAAGAT GAAGC AC AT GT C T C AGAT GAAT T CT C C AA 243 

Qy 2985 AAATAGGTCCAGTGTATCTAAGGCATCCATATCGCCTTCAAATGTCTCTGCTTTGGAACC 3044 

II I I I II I I I I I I I I I I I I I I I II III II I I I I I I I I I I I I I I I I I II I I 

Db 244 AAGTAGGT C CAGT GTATCTAAGGTGCCCT TAT T GCT TCCAAATGTTTCTGCTTTGGAATC 303 

Qy 3045 T CAGACAGAAAT GGGCAGCATAGTTAAAT CCAAAT CACTT ACGAAAGAAGCAGAGAAAAA 3104 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 

Db 304 T CAAAT AGAAAT GGG CAACAT AGT TAAAC C CAAAGT ACT T AC GAAAGAAGCAGAGGAAAA 3 63 

Qy 3105 ACTT C CTT CT GACACAGAGAAAGAG GAC AGAT CC CT GT CAGCT GTATT GT CAGCAGAGCT 3164 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I II 

Db 364 AC TTCCTTCT GAT ACAGAGAAAGAG GACAGAT C C CT GAC AGCT GT AT T GT CAG CAGAG CT 423 

Qy 3165 GAGT AAAACT T CAGTT GT T GAC CT C CT CT ACT G GAGAGAC ATT AAGAAGACT G GAGT G GT 3224 

II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I 

Db 42 4 GAAT AAAACT T CAGTT GT T GAC CT C CT GT ACT G GAG AGAC ATT AAGAAGACT GGAGT GGT 4 83 

Qy 322 5 GTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAAC 32 84 

I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 484 GTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAAC 543 

Qy 32 8 5 GGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGT 334 4 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 544 GGCCTACATTGCCTTGGCCCTGCTCTCTGTGACTATCAGCTTTAGGATATATAAGGGTGT 603 



Qy 3345 GATCCAGGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCATATTTAGAATCTGA 3404
        |||||| |||||||||||||||||||||||||||||||||||||||||||| ||||||||
Db  604 GATCCAAGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCATATTTGGAATCTGA 663

Qy 3405 AGTT GC T AT AT C AGAG GAAT T GGT T C AGAAAT AC AGTAAT T CT GCT CT T GGT CAT GT GAA 3464 

MUM I I I I I M M M M M M M M M M I I M M M M M M M I M M M M M 

Db 664 AGTT GC CATAT CAGAGGAATT GGTT CAGAAAT AT AGTAAT T CT GCT CTT GGT CATGTGAA 723 

Qy 3465 C AGC AC AAT AAAAGAAC T GAG GCGGCTTT T CT T AGT T GAT GAT T T AGT T GAT T C C CT GAA 3524 

M M M M M M M M M M M I M M M M M M M M M M M M M M M M M 

Db 724 CAGCACAAT AAAAGAAT T GAGG C GT CT C T T CTT AGT T GAT GAT T TAGT T GAT T C C CT GAA 783 

Qy 3525 GTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCC-TTGTTCAATGGTCTGACAC 3583 

I M M M M M M M M M I M I M M I M M M M I M M M M M I I I I I I I I 

Db 784 G-TTGCAGTGTTGATGTGGGTATTTACTTACGTTGGTGCCTTTGTTCAATGGTTTGACAC 842 

Qy 3584 TACT GATTTTAGCTCT GAT CTCACTCTTCAGTATTCC 362 0 

M I M M I M M I M M M M M M M M M M M I 

Db 8 43 TACT GAT T TT AGC C C T GAT C T CACT CTT CAGT ATT CC 879 



RESULT 5 

CA504729/c 

LOCUS       CA504729 796 bp mRNA linear EST 14-NOV-2002
DEFINITION  UI-R-FJ0-cpx-e-15-0-UI.s1 UI-R-FJ0 Rattus norvegicus cDNA clone
            UI-R-FJ0-cpx-e-15-0-UI 3', mRNA sequence.
ACCESSION   CA504729
VERSION     CA504729.1 GI:24995683
KEYWORDS    EST.
SOURCE      Rattus norvegicus (Norway rat)
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1 (bases 1 to 796)
  AUTHORS   Bonaldo,M.F., Lennon,G. and Soares,M.B.
  TITLE     Normalization and subtraction: two approaches to facilitate gene
            discovery
  JOURNAL   Genome Res. 6 (9), 791-806 (1996)
  MEDLINE   97044477
   PUBMED   8889548
COMMENT     Contact: Soares, MB
            Coordinated Laboratory for Computational Genomics
            University of Iowa
            375 Newton Road, 4156 MEBRF, Iowa City, IA 52242, USA
            Tel: 319 335 8250
            Fax: 319 335 9565
            Email: bento-soares@uiowa.edu
            Tissue Procurement: Dr. James Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Researchers may obtain clones from Research
            Genetics (www.resgen.com).
            The following repetitive elements were found in this cDNA
            sequence: 1-35, >POLY_A#Simple_repeat (matched complement)
            Seq primer: M13 FORWARD
            POLYA=Yes.
FEATURES             Location/Qualifiers
     source          1..796
                     /organism="Rattus norvegicus"
                     /mol_type="mRNA"
                     /strain="Sprague-Dawley"
                     /db_xref="taxon:10116"
                     /clone="UI-R-FJ0-cpx-e-15-0-UI"
                     /tissue_type="embryo"
                     /dev_stage="embryo"
                     /lab_host="DH10B (Life Technologies) (T1 phage resistant)"
                     /clone_lib="UI-R-FJ0"
                     /note="Vector: pYX-Asc; Site_1: EcoR I; Site_2: Not I;
                     UI-R-FJ0 is a cDNA library containing the following
                     tissue(s): rat embryo. The library was constructed
                     according to Bonaldo, Lennon and Soares, Genome Research,
                     6:791-806, 1996. First strand cDNA synthesis was primed
                     with an oligo-dT primer containing a Not I site. Double
                     stranded cDNA was ligated to an EcoR I adaptor, digested
                     with Not I, and cloned directionally into pT7T3-Pac
                     vector. The oligonucleotide used to prime the synthesis of
                     first-strand cDNA contains a library tag sequence that is
                     located between the Not I site and the (dT)18 tail. The
                     sequence tag for this library is CATCTCTACT. This library
                     was created for the University of Iowa Program for Rat
                     Gene Discovery and Mapping (Val Sheffield, Bento Soares
                     and Tom Casavant)
                     TAG_TISSUE=rat-embryo
                     TAG_LIB=UI-R-FJ0
                     TAG_SEQ=CATCTCTACT"
ORIGIN

Query Match 19.4%; Score 725.6; DB 14; Length 796;

Best Local Similarity 99.3%; Pred. No. 9.9e-97;

Matches 728; Conservative 0; Mismatches 5; Indels 0; Gaps 0;

Qy 952 TTTAAAGAACATGGATACCTTGGTAACTTATCAGCAGTGTCATCCTCAGAAGGAACAATT 1011

I I M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I || I M I I I I I I I I I I I I 
Db 733 T CTAAAGAACAT GGAT ACCTT GGTAACTT AT CAGCAGT GT CAT CCT CAGAAGGAACAATT 674 

Qy 1012 gaagat^c/tttaaatgaagcttctaaagagttgccagagagggcaacaaatccatttgta 1071 
I I I M I I [\ I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 673 GAAGAAACTNJAAAT GAAGC T T C T AAAGAGT T GC C AGAGAGGGCAACAAAT C CAT T T GT A 614 

Qy 1072 AAT AGAGAT T T AGCAGAATT T T C AGAAT T AGAAT AT T CAGAAAT GGGAT CAT C T T T T AAA 1131 

I I I I M I I I I I I I I I I I I I I I I I || I I I | | | | | | | | | | | M | | | | | | || | | | | | | | || | | 
Db 613 AAT AG AG AT T TAG CAG AAT T T T CAG AAT T AGAAT AT T CAGAAAT G G GAT CAT C T T T T AAA 554 

Qy 1132 GGCT C C C C AAAAG GAGAGT C AGC CAT AT T AGT AGAAAAC AC T AAGGAAGAAGT AAT T GT G 1191 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I J I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 553 G G C T C C C CAAAAG GAGAGT CAG C CAT AT T AGT AGAAAAC AC T AAG GAAGAAGT AAT T GT G 4 94 

Qy 1192 AGGAGTAAAGAC AAAGAG GATT T AGT T T GT AGT GC AG C C CT T CAC AGT C C ACAAGAAT C A 1251 

I I I I I I I I I II I I I I I I I I I II I I I II I I I I I I I I I II I I I I I II I I I I I I || | I | I I I I 
Db 493 AG GAGT AAAGACAAAGAGGATT T AGT T T GT AGT G CAGC C C T T CACAGT C C ACAAGAAT C A 434 



Qy 1252 CCTGTGGGTAAAGAAGACAGAGTTGTGTCTCCAGAAAAGACAATGGACATTTTTAATGAA 1311
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  433 CCTGTGGGTAAAGAAGACAGAGTTGTGTCTCCAGAAAAGACAATGGACATTTTTAATGAA 374



Qy 1312 AT GC AGAT GT CAGT AGT AG C AC C T GT GAG GGAAGAGT AT G C AGAC T T TAAGC CAT T T GAA 1371 

I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | 
Db 373 AT G C AGAT GT CAGT AGT AG C AC C T GT GAG GGAAGAGT AT GC AGAC T T TAAGC C ATT T GAA 314 

Qy 1372 CAAGCATGGGAAGTGAAAGATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT 1431 

M II I I I I I I I I I I I I I I I 1 II I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | M | | | | | | 
Db 313 CAAG CAT G GGAAGT GAAAGATACT T AT GAGG GAAGT AGGGAT GTGCTGGCT GC T AGAGCT 254 

Qy 1432 AAT GT G GAAAGTAAAGT GGAC AGAAAAT GCT T GGAAGAT AGC CT GGAG CAAAAAAGT CTT 1491 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I | | | | | | | | 
Db 253 AAT GT G GAAAGTAAAGT GGAC AGAAAAT GCT T GGAAGAT AGC CT GGAG CAAAAAAGT CTT 194 

Qy 1492 GG GAAGG AT AGT GAAG G C AGAAAT GAG GAT GCTTCTT T C C C CAGT AC C C C AGAAC CT GT G 1551 

I M I I I I I I M I I I I I I I I I I II I I I I II I I I II I I I I I I I I I I I I I I I I I I || I || | | | 
Db 193 GG GAAGGAT AGT GAAG G C AGAAAT G AGGAT GCTTCTTTCCC CAGT AC C C C AGAAC CT GT G 134 

Qy 1552 AAGGAC AG C T C CAGAG CAT AT AT T AC CT GT G CT T C CT TT AC CT C AGCAAC C GAAAGCAC C 1611 

M I I I I I I I I I I I I I I I I M I I I I I I II I I I I I II II I I || | | | | || | || | | || | | | | || 
Db 133 AAGGAC AG CT C CAGAGCAT ATAT T AC CT GT GCT T C CT TT AC C T C AGCAAC C GAAAGCAC C 7 4 

Qy 1612 ACAGCAAACACT TT CCCTTT GTT AGAAGAT CATACTTCAGAAAATAAAACAGAT GAAAAA 1671 

I M I I I I I II I I I I II I I I I I I I I I I || | | | | | | | | | | | | | | | | | | | | | | | | | | | 

Db 73 ACAGCAAACACTTT CC CTTT GTTAGAAGAT CATACTTCAGAAAATAAAACAGAT GAAAAA 14 

Qy 1672 AAAATAGAAGAAA 1684 

I I I I I II III 
Db 13 AAAAAAAAAAAAA 1 



RESULT 6
BI730192
LOCUS       BI730192                 805 bp    mRNA    linear   EST 20-SEP-2001
DEFINITION  603349739F1 NIH_MGC_94 Mus musculus cDNA clone IMAGE:5357385 5',
            mRNA sequence.
ACCESSION   BI730192
VERSION     BI730192.1  GI:15707205
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 805)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: The Cepko Laboratory
            cDNA Library Preparation: Life Technologies, Inc.
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Incyte Genomics, Inc.
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM11908 row: n column: 10
            High quality sequence stop: 802.
FEATURES             Location/Qualifiers
     source          1..805
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:5357385"
                     /tissue_type="retina"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_94"
                     /note="Organ: eye; Vector: pCMV-SPORT6; Site_1: NotI;
                     Site_2: SalI; Cloned unidirectionally; oligo-dT primed.
                     Average insert size 3.3 kb. Library enriched for
                     full-length clones and constructed by Life Technologies.
                     Note: this is a NIH_MGC Library."
ORIGIN

Query Match 19.0%; Score 709.8; DB 12; Length 805; 

Best Local Similarity 93.8%; Pred. No. 2.1e-94; 

Matches 751; Conservative 0; Mismatches 47; Indels 3; Gaps 1; 
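
The summary figures above appear to be related by simple arithmetic. The sketch below is an inference from the reported numbers, not a documented GenCore formula: it assumes Best Local Similarity is Matches divided by the total of Matches, Mismatches and Indels, and that Query Match is Score divided by the perfect score (3741, taken from the run header). The function names are hypothetical.

```python
def best_local_similarity(matches, mismatches, indels):
    """Percent of aligned columns that are identical (assumed definition)."""
    columns = matches + mismatches + indels
    return 100.0 * matches / columns


def query_match(score, perfect_score):
    """Alignment score as a percent of the perfect score (assumed definition)."""
    return 100.0 * score / perfect_score


# Figures reported for RESULT 6 (BI730192); 3741 is the perfect score
# listed in the run header for query US-09-830-972-1.
print(round(best_local_similarity(751, 47, 3), 1))  # 93.8
print(round(query_match(709.8, 3741), 1))           # 19.0
```

Under these assumptions the same arithmetic reproduces the percentages printed for the other results in this listing, for example 771 / (771 + 51 + 11) = 92.6% for RESULT 7.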

Qy 1854 AGCAT GT GAAAGT GAAC T GAAT GAAG C C AC AGGT AC AAAGAT T GCT T AT GAAACAAAAGT 1913 

I I I M I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | M I I 
Db 1 AGCATGTGAAAGTGAACTGAACGAAGCCACAGGTACAAAGATTGCTTATGAAACAAAAGT 60 

Qy 1914 G GACT T G GT C C AAAC AT C AGAAG CT AT ACAAGAAT C AC T TT AC C C CAC AG C ACAGCT TT G 1973 

I I I I I I I M I I I I I II I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I 11 I I M I I I 
Db 61 GGACTT GGT C CAGACAT CAGAAGCT ATACAAGAGT CAATTT AC CCCACAGCACAGCTTT G 12 0 

Qy 197 4 C C CAT CAT T T GAG GAAGC T GAAG CAACT C C GT CAC C AGT T T T GC CT GAT ATT GT TAT GGA 2033 

I I M I I I I I I II I I I I I I I I I I M I I I I I I || I I I I I || | || | || I I 1 I II I I I I I II I I 
Db 121 C C CAT CAT T T GAGGAAG CT GAAGCAACT C C GT CAC C AGT TT T G C CT GAT ATT GT TAT GGA 180 

Qy 2034 AGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATC 2 093 

Ml M M I I I I I I I I I I M I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I III 
Db 181 AGCGCCATTAAATTCTCTCCTTCCAAGCACTGGTGCTTCTGTAGCGCAGCCCAGTGCATC 240 

Qy 2 094 C C CACT GGAAGCACCT C CT C CAGT T AGT TAT GAC AGT AT AAAGC T T GAG C CT GAAAAC C C 2153 

I I I I M I I I I Ml I II I I I I I I I I I I I I I I I I II I I I I I I || I I || I | I || I || 
Db 241 C C CAC T AGAAG T AC C GT CT C CAGT T AGT TAT GAC GGT AT AAAGCT T GAGC CT GAAAAT C C 30 0 

Qy 2154 C C CAC CAT AT GAAGAAG C CAT GAAT G TAG C AC T AAAAG C T T T G G GAAC AAAG GAAG GAAT 2213 

I I I I I I I I I I I II M I I II I I I I I I M I II I I I I I I I I II I I I I I I I II III 
Db 301 C C CAC CAT AT GAAGAAGC C AT GAGT GT AGC ACTAAAAACAT C GGAC T CAAAGGAAGAAAT 360 

Qy 2214 AAAAGAGCCTGAAAGTTTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTTATATATCCAT 2273 

I I I I I M II I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I | I I II 
Db 361 TAAAGAGC C T GAAAGT T T T AAT G C AGCT G CT CAGGAAGC AGAAG C T C CT TAT AT AT CC AT 420 

Qy 2274 T GC GT GT GAT T T AATTAAAGAAACAAAGCT C T C CACT GAGC C AAGT C C AGAT T T CT CT AA 2333 

Ml M I II I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 421 TGCATGT GAT TTAATTAAAGAAACAAAGCTCTC CACT GAGC CAAGTCCAGAGTTCTCTAA 4 80 

Qy 2334 TT AT T CAGAAAT AGCAAAAT T C GAGAAGT C GGT GC C C GAAC AC GCT GAG CT AGT G GAG GA 2393 

I II I I I I I I I I I I I I II I I II II I I I I I I I I I I I I II III I I I I I I I I I I I II 
Db 481 T TAT T CAGAAAT AGCAAAAT T T GAGAAGT CGGTGCCT GAT CAC T GT GAGCT C GT G GAT G A 540 



Qy 2394 T T C CT CAC C T GAAT C T GAAC C AGT T GACT T AT TT AGT GAT GAT T C GAT T C CT GAAGT C C C 2453 

I I M I I I I I I I I I I I I I i II I I I I I I I i I I I I I I | I II I I I I I I I | I I I I I I I I M I I 
Db 541 T T C C T CAC C C GAAT CT GAAC C AGT T GACT TAT TTAGT GAT GAT T CAAT T C CT GAAGT C C C 600 

Qy 2454 ACAAAC ACAAGAGGAGG C T GT GAT G CT CAT GAAG GAGAGT CT CACT GAAGT GT CT GAGAC 2513 

I M I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 601 ACAAACACAAGAG GAGG CT GT GAT GCTAAT GAAGGAGAGT CT CACT GAAGT GT CT GAGAC 660 

Qy 2514 AGTAGCCCAGCACAAA GAG G AGAGAC T T AGT G C CT CAC C T CAGGAGC T AG GAAAG C C 257 0 

I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II | | | | 

Db 661 AGTAACACAAC ACAAAC AT AAGGAGAGACT T AGT GCT T CAC C T C AGGAGGT AGGAAAG C C 720 

Qy 2571 AT AT TT AGAGT CT T T T C AG C C CAAT T T AC ATAGT ACAAAAGAT GCT G CAT CT AAT GAC AT 2630 

I I M I I II I I I I I I I I I I I I I I I II I I I I I I I IN | | M | || | M | | | | | | Ml M 
Db 721 AT AT TT AGAGT CT T T T C AGC C C AATT T AC AT AT T AC CAAAGAT GCT GC AT C T ACT GAAAT 780 

Qy 2631 T C C AAC AT T GAC C AAAAAG G A 2651 

I I I I I I I I I I I I I I I II I I I I 
Db 781 T C C AAC AT T GAC C AAAAAG G A 801 



RESULT 7
CB521332
LOCUS       CB521332                 822 bp    mRNA    linear   EST 09-JUL-2003
DEFINITION  UI-M-GH0-cem-h-13-0-UI.r1 NIH_BMAP_GH0 Mus musculus cDNA clone
            IMAGE:6841502 5', mRNA sequence.
ACCESSION   CB521332
VERSION     CB521332.1  GI:29354687
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 822)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Jim Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Distribution information can be found at
            http://genome.uiowa.edu/distribution/mousef1.html
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..822
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6841502"
                     /tissue_type="Whole brain"
                     /dev_stage="1, 5, and 15 days newborn"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_GH0"
                     /note="Organ: Brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated
                     with EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is CGAACTGAAT. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH), Hemin Chin, Ph.D.,
                     program coordinator."
ORIGIN



Query Match 19.0%; Score 709.4; DB 14; Length 822; 

Best Local Similarity 92.6%; Pred. No. 2.4e-94; 

Matches 771; Conservative 0; Mismatches 51; Indels 11; Gaps 2; 



Qy 2762 CTCCGATTGAGATAATAGATGAATTTCCCACGTTTGTCAGTGCTAAAGATGATTCTCCTA 2821
        |||| ||||||||||||||||| |||||||| |||||||||||||||||||||||||||
Db    1 CTCCCATTGAGATAATAGATGAGTTTCCCACATTTGTCAGTGCTAAAGATGATTCTCCT- 59

Qy 2822 AATTAGCCAAGGAGTACACTGATCTAGAAGTATCCGACAAAAGTGAAATTGCTAATATCC 2881
                |||||||||||||| |||||||||||| |||||||||||||||||||| |||
Db   60 --------AAGGAGTACACTGACCTAGAAGTATCCAACAAAAGTGAAATTGCTAATGTCC 111

Qy 2882 AAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTCAAGAATA 2941
        | ||||||||  |||| |||||||||| ||||||||||||||||||||||||||||||||
Db  112 AGAGCGGGGCCAATTCGTTGCCTTGCTCAGAATTGCCCTGTGACCTTTCTTTCAAGAATA 171

Qy 2942 TATATCCTAAAGATGAAGTACATGTTTCAGATGAATTCTCCGAAAATAGGTCCAGTGTAT 3001
         ||||||||||||||||| |||||| ||||||||||||||| ||| ||||||||||||||
Db  172 CATATCCTAAAGATGAAGCACATGTCTCAGATGAATTCTCCAAAAGTAGGTCCAGTGTAT 231

Qy 3002 CTAAGGCATCCATATCGCCTTCAAATGTCTCTGCTTTGGAACCTCAGACAGAAATGGGCA 3061
        ||||||   || ||| || | ||||||| |||||||||||| |||| | |||||||||||
Db  232 CTAAGGTGCCCTTATTGCTTCCAAATGTTTCTGCTTTGGAATCTCAAATAGAAATGGGCA 291

Qy 3062 GCATAGTTAAATCCAAATCACTTACGAAAGAAGCAGAGAAAAAACTTCCTTCTGACACAG 3121
         |||||||||| |||||  ||||||||||||||||||| |||||||||||||||| ||||
Db  292 ACATAGTTAAACCCAAAGTACTTACGAAAGAAGCAGAGGAAAAACTTCCTTCTGATACAG 351

Qy 3122 AGAAAGAGGACAGATCCCTGTCAGCTGTATTGTCAGCAGAGCTGAGTAAAACTTCAGTTG 3181
        |||||||||||||||||||| |||||||||||||||||||||||| ||||||||||||||
Db  352 AGAAAGAGGACAGATCCCTGACAGCTGTATTGTCAGCAGAGCTGAATAAAACTTCAGTTG 411

Qy 3182 TTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTAT 3241
        |||||||||| |||||||||||||||||||||||||||||||||||||||||||||||||
Db  412 TTGACCTCCTGTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTAT 471

Qy 3242 TCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGG 3301
        |||||||||||||||||||||||||   ||||||||||||||||||||||||||||||||
Db  472 TCCTGCTGCTGTCTCTGACAGTGTT--TCATTGTCAGTGTAACGGCCTACATTGCCTTGG 529



Qy 3302 CCCTGCTCTCGGT GACT AT CAGCT T T AGGATAT AT AAGG GC GT GAT C C AGG C TAT C C AGA 3361 

I I I I I I I I I I II I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I || I I 
Db 53 0 CCCTGCTCTCTGT GAC TAT C AG CT T TAG GAT AT AT AAG GGT GT GAT C C AAGCT AT C C AGA 589 

Qy 3362 AAT CAGAT GAAGGC CAC CCAT T CAGGGCAT ATTTAGAAT CT GAAGTT GCTAT AT CAGAGG 3421 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 590 AAT CAGAT GAAGG C CAC C CAT T C AG GGC AT AT T T G GAAT CT GAAGT T GC CAT AT C AGAG G 649 

Qy 3422 AAT T G GT T C AGAAAT ACAGTAAT TCTGCTCTTGGT CAT GT GAAC AG C ACAAT AAAAG AAC 34 81 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I 11 I I I I I I I I I II I I I I I I I I I I I I I I 
Db 650 AATT GGT T CAGAAATATAGTAATT CT GCT CTT GGT CAT GT GAACAGCACAATAAAAGAAT 7 09 

Qy 3482 TGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTGATGT 3541 

I I I I I I I II I I I I I II I I I I I I I I I I I II M I I I I I I I I I I I I I I I II I II I I I I I I 
Db 710 T GAGGC GT CT C T T CT T AGT T GAT GACT T AGT T GAT T C C C T GAAGT T T G C AGT GT T GAT GT 769 

Qy 3542 GGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTA 3594 

I I I I I I I I II I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I MM 
Db 770 GGGTATTTACTTACGTTGGTGCCTTGTTCAATGGTTTGACACTACTGACTTTA 822 



RESULT 8
BU841009
LOCUS       BU841009                 986 bp    mRNA    linear   EST 16-OCT-2002
DEFINITION  AGENCOURT_10187690 NIH_MGC_134 Mus musculus cDNA clone
            IMAGE:6518816 5', mRNA sequence.
ACCESSION   BU841009
VERSION     BU841009.1  GI:24025409
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 986)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. David Rowe
            cDNA Library Preparation: Invitrogen Corp
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM14101 row: c column: 09
            High quality sequence start: 21
            High quality sequence stop: 644.
FEATURES             Location/Qualifiers
     source          1..986
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6518816"
                     /tissue_type="undifferentiated limb"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_134"
                     /note="Vector: pCMV-SPORT6.1; Site_1: EcoRV; Site_2: NotI;
                     Cloned unidirectionally. Primer: Oligo dT. Average insert
                     size 1.7 kb. Constructed by ResGen, Invitrogen Corp. Note:
                     this is a NIH_MGC Library."
ORIGIN

Query Match 18.9%; Score 707.8; DB 13; Length 986; 

Best Local Similarity 87.8%; Pred. No. 4.1e-94; 

Matches 832; Conservative 0; Mismatches 103; Indels 13; Gaps 5; 

Qy 172 8 AAAT CCTTTCCTT GT AG CAGT AC AGGAT T CT GAGGC AGAT TAT GT TACAACAGAT AC CT T 1787 

I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I II 
Db 2 9 AAAT CCTTTCCTT GT AGCAAT AC AT GAT T C T GAGG C AGAT TAT GT CACAAC AGAT AAT T T 8 8 

Qy 1788 AT CAAAGGTGACTGAGGCAGCAGT GT CAAACAT GCCT GAAGGT CT GAC GCCAGATTT AGT 1847 

I I M I I I I M I I I i I M I I I Mil III II I I I I I I M I I I I I I I I I I I I I I I I I I I 
Db 8 9 AT CAAAGGTGACTGAGGCAGTAGTGGCAAC CAT GCCT GAAGGT CTAAC GCCAGATTT AGT 14 8 

Qy 1848 T C AGGAAGCAT GTGAAAGT GAAC T GAAT GAAG C CACAGGT ACAAAGATT GCT T AT GAAAC 1907 

I I I M I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I II II I I I I I 
Db 14 9 T CAG GAAG CAT GT GAAAGT GAACT GAAC GAAGC CACAGGT ACAAAGAT T GC T TAT GAAAC 208 

Qy 1908 AAAAGT GGACT T GGT C CAAACAT C AGAAGCT AT AC AAGAAT CACT T T AC C C C AC AGC AC A 1967 

I I M I I I I I M I I I I I I I I I I I I II II I I I II I I (I II III I I I I I I I I I I I I I II I 
Db 2 09 AAAAGT GGACT T GGT C CAGAC AT C AGAAG CT AT ACAAGAGT C AAT TT AC C C C ACAG C AC A 2 68 

Qy 1968 GCTTTGCC CAT CAT TT GAGGAAG CT GAAGC AACT C C GT C AC CAGT T TT GC CT GAT ATT GT 2027 

I I I M I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 269 GCTTTGCC CAT C AT TT GAG GAAGC T GAAGCAACT C C GT CAC CAGT T T T G C CT GAT AT T GT 328 

Qy 2028 TATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAG 2 087 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I 

Db 32 9 TATGGAAGCGCCATTAAATTCTCTCCTTCCAAGCACTGGTGCTTCTGTAGCGCAGCCCAG 388 

Qy 2088 T GT AT C C C CACT GGAAGCAC CT C CT C C AGT T AGT TAT GAC AGT AT AAAGC T T GAGC CT GA 2147 

M I I I I I I I I I I I I I III I I I I II I I I I I II II I I I I I I I I I I I II I I I I I I I I 
Db 38 9 T G CAT C C C CAC T AGAAGT AC C GT CT C CAGT T AGT TAT GAC GGT ATAAAG CT T GAGC CT GA 448 

Qy 214 8 AAAC C C C C CAC CAT AT GAAGAAG C CAT GAAT GT AGCAC TAAAAGC TT T GG GAACAAAGG A 2207 

III I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 44 9 AAAT C C C C CAC CAT AT GAAGAAGC C AT GAGT GT AG C ACT AAAAAC AT C GGAC GCAAAGGA 508 

Qy 2208 AG GAATAAAAGAGC CT GAAAGT TT T AAT GC AGCT GT T CAGGAAAC AGAAG CT C C T TAT AT 2267 

II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 509 AGAAATTAAAGAGC C T GAN AGT TT T AAT G C AGCT GCT CAG GAAGC AGAAGC T C CT TAT AT 568 

Qy 2268 AT C CAT T GC GT GT GAT T T AATT AAAGAAACAAAGCT C T C CACT GAGC CAAGT C CAGAT T T 2327 

I I I I I I I I I I I I I II I I I I I II I I I I I I I I II i I I I I I I I I I I I I I I I I I I I I I M II 
Db 569 AT C CAT T G CAT GT GAT T TAATTAAAGAAACAAAGCT CT C CAC T GAGC CAAGT C C AGAGT T 628 

Qy 2328 CTCT AATT ATT CAGAAATAGCAAAAT T CGAGAAGTCGGTGCCC GAAC AC GCT GAGC T AGT 2 38 7 

I I I I I I I I II I I II I I I I I I I I I II I I I I I I I I I I II I I II II III I I I I I I II 
Db 62 9 CTCTAATTATTCAGAAATAGCAAAATTTGAGAAGTC GGT GCCT GAT CACT GTGAGCTCGT 688 



Qy 2388 GGAGGAT T C CT C AC CT GAAT C T GAAC C AGTT GACT T AT T T AGT GAT GAT T C GAT T C CT GA 2447 

III I I I I I I I I I I I I I ! I I I I I I I I || I I I I I I II I I I I I | I || | | I | I || | | | | | 
Db 689 G GAT GAT T C CT C AC C C GAAT C T GAAC C AGT T GAC T TAT T T AGT GAT GAT T C AAT C C CT G A 74 8 

Qy 244 8 AGT CCCACAAAC ACAAGAGGAGGCT GT GATGCT CAT GAAGGAGAGTCT CACT GAAGT GTC 2507 

I I I I I I I I I I I I I II II II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 749 AGT C C C AC CAAC AC AAGAGGAAG CT GT GAT G CT T AT GAAAGAGAGT C T C AC CT GAAT GT C 8 08 

Qy 2508 T GAGAC AGT AG C C C AGC ACAAAGAG GAGAGACT T AGT G C CT C AC C T C AGGAG 2559 

I I I M I I I I I I I I I I I I I I I I I I I I II I I M I I I I I III 

Db 8 09 T GAGAC AGTT AC C C C AC C C CAAC AT AAG GGAGAGACT T AGT GCTTTCCCCTCCG GAGGGT 8 68 

Qy 2560 C T AGGAAAGC C AT AT T T AGAGT CT T T T — CAGCCCAATTTACATAGT — ACAAAAGATGC 2615 

I I M I I I I I I I I II I I I I II I I I I II I I I I I I I I I I I II 

Db 8 69 AGAAAAGGC C CT AT T T T AGAGT CT T T T T C AG C C CCAATT T AC C TAT T T AC CAAAGGAT GC 928 

Qy 2616 T GC AT - C T AAT GAC AT T C CAAC AT T GAC CAAAAAGGAGAAAAT T T CT T 2662 

I I I I M I I I I I I I I I I I I I I I I I I I I III 
Db 929 TGCCTCCTAATGAAAATTCCACCTTTGGCCCAAAAGGGAGACCATTTT 976 



RESULT 9
CF948588
LOCUS       CF948588                 772 bp    mRNA    linear   EST 20-NOV-2003
DEFINITION  UI-M-HJ0-cmt-g-13-0-UI.r1 NIH_BMAP_HJ0 Mus musculus cDNA clone
            IMAGE:30632124 5', mRNA sequence.
ACCESSION   CF948588
VERSION     CF948588.1  GI:38464457
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 772)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. James Lin University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Distribution information can be found at
            http://genome.uiowa.edu/distribution/mousef1.html
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..772
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:30632124"
                     /tissue_type="Upper Head"
                     /dev_stage="9.5 and 10.5 dpc"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_HJ0"
                     /note="Organ: Head; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated
                     with EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is CGAACTGAAT. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH)."
ORIGIN

Query Match 18.1%; Score 678.4; DB 14; Length 772; 

Best Local Similarity 92.8%; Pred. No. 8.6e-90; 

Matches 725; Conservative 0; Mismatches 47; Indels 9; Gaps 1; 

Qy 2 64 4 AAAAAGGAGAAAAT T T C T T T G CAAAT GGAAGAGT T T AAT ACT GCAAT T TAT T CAAAT GAT 2703 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I II I I I II I I I II I I I I 
Db 1 AAAAAG G AGACAAT T T CT T T GCAAAT G GAAGAGT T TAAT ACT G CAAT T TAT T C CAAT GAT 60 

Qy 2704 GAC T T ACT T T CT T CT AAGGAAGACAAAATAAAAGAAAGT GAAAC AT T T T C AGAT T CAT CT 2763 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I II I I 

Db 61 GACT T AC T T T CT T CTAAGGAAGACAAAAT GAAAGAAAGT GAAAC ATT T T C C GAT T CAT C T 12 0 

Qy 2764 C CGAT T GAGAT AAT AGAT GAAT T T C C CAC GT TT GT CAGT G CTAAAGAT GAT T CT C CT AAA 2823 

II I I I I M I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 

Db 121 C CC ATT GAGAT AAT AGAT GAGT T T C C CAC AT T T GT CAGT G C T AAAGAT GAT T CT C C T 17 7 

Qy 2824 T T AGC CAAGGAGT AC ACT GAT CT AGAAGT AT CC GACAAAAGT GAAAT T GC TAAT AT C CAA 2883 

I I I II I II I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I 
Db 178 AAGGAGT ACAC T GAC CT AGAAGT AT C CAACAAAAGT GAAAT T G CT AAT GT C C AG 231 

Qy 2884 AGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTCAAGAATATA 294 3 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I II I I M I I I I I I I I I I I I I 
Db 232 AGCGGGGCCAATTCGTTGCCTTGCTCAGAATTGCCCTGTGACCTTTCTTTCAAGAATACA 291 

Qy 2944 TAT C CTAAAGAT GAAGT AC AT GT T T C AG AT GAAT T C T C C GAAAAT AG GT C CAGT GT AT C T 3003 

I M I I I I I I I I I I I I I III! I I I I I I I I I I I I I I I II! I I I I I I I I I I I I III 
Db 292 TATCCTAAAGATGAAGCACATGTCTCAGATGAATTCTCCAAAAGTAGGTCCAGTGTGTCT 351 

Qy 3004 AAGGCATCCATATCGCCTT CAAAT GTCTCTGCTTTGGAACCTCAGACAGAAATGGGC AGC 3063 

I I I I II III I I I I I I I I I I I I I M I I I I I I I I I I I I I I I II I I I I I I I 
Db 352 AAGGTGCCCTTATTGCTTCCAAATGTTTCTGCTTTGGAATCTCAAATAGAAATGGGCAAC 411 

Qy 3064 AT AGT T AAAT C CAAAT CACT T AC GAAAGAAG CAGAGAAAAAACT T C C T T C T GAC ACAGAG 3123 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I II I I I I MINI 
Db 412 AT AGT T AAAC C CAAAGT ACT T AC GAAAGAAGCAGAGGAAAAACT T C CT T CT GAT ACAGAG 471 

Qy 3124 AAAGAG G ACAGAT C C C T GT CAG C T GT AT T GT CAGC AGAG C T GAGT AAAACT T CAGT T GT T 3183 

I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I II I I I I II I I I I I II I I I I I I I I I 
Db 472 AAAGAG GAC AGAT C C CT GAC AG C T GT AT T GT CAG C AGAG CT GAAT AAAAC T T CAGT T GT T 531 



Qy 3184 GACCT C CT CT ACT GGAGAG AC AT TAAGAAGACT GGAGT GGT GT T T G GT G C C AGC T TAT T C 3243 

I I I I I I II II I I I I I I I I I I 1 II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 532 GAC C T C CT GT ACT G GAGAGAC AT TAAGAAGACT GGAGT GGT GT T T GGT GC C AGCT T AT T C 591 

Qy 3244 CTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCC 3303 

II I I I I M I I I I I I I I I I II I I I I I I I I I I I I || I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 592 CTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCC 651 

Qy 3304 CTGCTCTCGGT GAC TAT C AG CT T TAG GAT AT AT AAGGGC GT GAT C C AGGC T AT C C AGAAA 3363 

I I I I I I I I I I I I I I I I I II I M II II I I I II I II I I I II I I I I I I I I I I I I I I I I I I 
Db 652 CTGCTCTCTGT GAC TAT CAGC T T T AGGAT AT ATAAGG GT GT GAT C CAN G CT AT C C AGAAA 711 

Qy 3364 T CAGAT GAAGGC CAC CCATT CAGGGCATATTTAGAAT CT GAAGTT GCT AT AT CAGAGGAA 3423 

I I I M I I I I I I I I I I I I I I I I I I M I I I I II I I I I I I II I I I I II II I I I I I I I I I I 
Db 712 T CAGAT GAAG GC CAC C CAT T C AGG GCAT AT T T GGAAT CT GAAGT T G GC AT AT CAG AG GAA 771 

Qy 3424 T 3424 

1 

Db 772 T 772 



RESULT 10
BU709106
LOCUS       BU709106                 778 bp    mRNA    linear   EST 15-JUL-2003
DEFINITION  UI-M-EW0-caz-g-1-0-UI.r1 NIH_BMAP_EW0 Mus musculus cDNA clone
            IMAGE:6419369 5', mRNA sequence.
ACCESSION   BU709106
VERSION     BU709106.1  GI:23642247
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 778)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. James Lin, Univeristy of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..778
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6419369"
                     /tissue_type="whole brain"
                     /dev_stage="embryo 15.5 dpc"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_EW0"
                     /note="Organ: brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according to
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured mRNa was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with an
                     oligo-dT primer containing a Not I site. Double stranded
                     cDNA was size selected according to mRNA size fraction,
                     ligated with EcoR I adaptor, digested with Not I, and then
                     cloned directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA
                     tail, is GTGCGTGGAA. This library was created for the
                     University of Iowa Mouse Brain Molecular Anatomy Project
                     (BMAP): 'Gene Discovery in the Developing Mouse Nervous
                     System', supported by National Instututes of Mental Health
                     (NIMH), Hemin Chin, Ph.D., program coordinator."
ORIGIN



Query Match 18.0%; Score 673.6; DB 13; Length 778; 

Best Local Similarity 93.9%; Pred. No. 4.3e-89; 

Matches 711; Conservative 0; Mismatches 45; Indels 1; Gaps 1; 

Qy 1753 GAT T CT GAGGC AGAT TAT GT T ACAAC AGAT AC CT T AT CAAAGGT GAC T GAGGCAGCAGT G 1812 

I I I I I I I I M I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2 GAT T CT GAGG C AGAT TAT GT C ACAAC AGAT AAT T TAT CAAAGGT GACT GAG GC AGTAGT G 61 

Qy 1813 T CAAAC AT G C CT GAAGGT CT GAC G C CAGAT T T AGT T C AGGAAG CAT GT GAAAGT GAACT G 1872 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I II I I I I I I 
Db 62 GCAACC AT G C CT GAAG GT CT AAC G C C AGAT TT AGT T C AG GAAGC AT GT GAAAGT GAACT G 121 

Qy 1873 AATGj^AGCCACAGGTACAAAGATTGCTTATGAAACAAAAGTGGACTTGGTCCAAACATCA 1932 

II I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 

Db 122 AACGAAGCCACAGGTACAAAGATTGCTTATGAT^ACAAAAGTGGACTTGGTCCAGACATCA 181 

Qy 1933 GAAGCT AT ACAAGAAT C ACT T T AC C C C ACAGCAC AGC T T T GCC C AT CAT T T GAGGAAGCT 1992 

I I I II I I I I I I I I I III I I I II I I I I I I I I I I I I I I I I I I II I I II I II I I I I II I I I 
Db 182 GAAG C TAT AC AAG AGT CAAT T T AC C C C ACAGCAC AGCT T T GC C CAT CAT T T GAG GAAGC T 241 

Qy 1993 GAAG CAACT C C GT CAC CAGT T TT GC CT GAT AT T GT TAT G GAAGC AC CAT TAAAT T CT CT C 2052 

I I II II I I I I I I I I I I I I I I I I I I II II II I II I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 2 42 GAAGCAACTCCGTCACCAGTTTTGCCTGATATTGTTATGGAAGCGCCATTAAATTCTCTC 301 

Qy 2053 CTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTGGAAGCACCTCCT 2112 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II III II 
Db 3 02 CTTCCAAGCACTGGTGCTTCTGTAGCGCAGCCCAGTGCATCCCCACTAGAAGTACCGTCT 3 61 

Qy 2113 C C AGTT AGT TAT GACAGT AT AAAGC T T GAGC CT GAAAAC C C CC C AC CAT AT GAAGAAGC C 2172 

I I I I II I II I I I I I I I I II I I I I I I I I I II I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 362 C C AGTT AGT TAT GAC GGT AT AAAGC T T GAGC CT GAAAAT C C C C CAC CAT AT GAAGAAGC C 421 

Qy 2173 AT GAAT GT AGC AC T AAAAGCT T T G G GAAC AAAGGAAG GAAT AAAAGAG C C T GAAAGTT T T 2232 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I III I I I I I I I I I I I I I I I I I I 
Db 422 AT GAGT GT AGCAC T AAAAAC AT C GGAC T C AAAGGAAGAAAT T AAA.GAGC C T GAAAGT T T T 481 






Qy 22 33 AAT GC AG CT GT T C AGGAAACAGAAGCT C CT TAT AT AT C CAT T GC GT GT GAT T T AAT T AAA 2292 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I 1 I 

Db 482 AAT G CAGC T GCT C AG GAAG CAGAAG C T C CT T AT AT AT C CAT T G CAT GT GAT T TAATT AAA 541 

Qy 22 93 GAAAC AAAG C T C T C CACT GAGC CAAGT C CAGAT T T C T C TAATT AT T C AGAAAT AGCAAAA 2352 

II I I II I I I I I I I I I I I I I I I I I II I 1 I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 542 GAAACAAAGCT C T C CACT GAGC CAAGT C CAGAGT T CT CT AATT AT T CAGAAAT AGCAAAA 601 

Qy 2353 T T C GAGAAGT CGGTGCCC GAACAC GCT GAG CT AGT GGAGGATT C CT C AC CT GAAT CT GAA 2412 

II I II I I I I I I I I I I I II III I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 602 T T T GAGAAGT C G GT GC CT GAT C AC T GT GAGC T C GT G GAT GAT T C CT CAC C C GAAT CT GAA 661 

Qy 2413 C CAGT T GACT T AT T T AGT GAT GATT C GATT C CT GAAGT C C C ACAAACACAAGAG GAG GCT 2472 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I 
Db 662 C CAGT T GACT TAT T T AGT GAT GAT T CAAT T C C T GAAGT - C C ACANAC ACAAGAGGAGGCT 72 0 

Qy 247 3 GT GAT GCT CAT GAAGGAGAGTCT CACT GAAGTGTCTG 2509 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 721 GT GAT GC T AAT GAAG GAGAGT C T CACT GAAGT GT CT G 757 



RESULT 11
CA320618
LOCUS       CA320618     777 bp    mRNA    linear   EST 09-JUL-2003
DEFINITION  UI-M-FW0-ccb-k-24-0-UI.r1 NIH_BMAP_FW0 Mus musculus cDNA clone
            IMAGE:6817393 5', mRNA sequence.
ACCESSION   CA320618
VERSION     CA320618.1  GI:24538742
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 777)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Jim Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..777
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6817393"
                     /tissue_type="whole brain"
                     /dev_stage="embryo 13.5,14.5,16.5,17.5dpc"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_FW0"
                     /note="Organ: Brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated
                     with EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is AGCGAGACAG. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH), Hemin Chin, Ph.D.,
                     program coordinator."
ORIGIN



Query Match 17.7%; Score 662.6; DB 14; Length 777; 

Best Local Similarity 92.1%; Pred. No. 1.8e-87; 

Matches 724; Conservative 0; Mismatches 50; Indels 12; Gaps 2; 
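The two percentages on the statistics lines can be reproduced from the other figures printed with them. The sketch below is an inference from the numbers in this report, not a documented GenCore formula: it assumes Query Match is the score divided by the perfect score (3741 for this query), and Best Local Similarity is the match count divided by the number of aligned columns (matches + mismatches + indels). The function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns (assumed definition)."""
    return 100.0 * matches / (matches + mismatches + indels)


# Figures reported for RESULT 11 (CA320618); perfect score is 3741.
qm = query_match_pct(662.6, 3741)
bls = best_local_similarity_pct(724, 50, 12)
print(f"Query Match {qm:.1f}%; Best Local Similarity {bls:.1f}%")
```

With the RESULT 11 figures this gives 17.7% and 92.1%, which agrees with the values printed in the report; the same two formulas also agree with the statistics lines of RESULTS 12 to 15.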

Qy 2147 AAAAC C C C C C AC CAT AT GAAGAAG C CAT GAAT GT AG CACT AAAAGCT T T G G GAACAAAG G 22 06 

I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I II I I I I I I 
Db 1 AAAAT C C C C C AC CAT AT GAAGAAGC C AT GAGT GT AG CACTAAAAAC AT C GGACT CAAAGG 60 

Qy 2207 AAGGAAT AAAAGAG C CT GAAAGT T T T AAT GC AGCT GT T CAGGAAAC AGAAGCT C CT TATA 22 66 

ill III I I I I II I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 
Db 61 AAGAAAT T AAAGAGC CT GAAAGT T T T AAT GC AGC T GC T CAGGAAGC AGAAGC T C CT TATA 120 

Qy 2267 TAT C CAT T G C GT GT GAT T TAAT T AAAGAAACAAAGCT CT C CACT GAGC CAAGT C CAGAT T 2 326 

I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 TAT C CAT T G CAT GT GAT T TAAT T AAAGAAACAAAG C T CT C CACT GAGC CAAGT C C AGAGT 180 

Qy 2327 T CT CT AAT TAT T CAGAAAT AGCAAAAT T C GAGAAGT C GGT G C C C GAAC AC GC T GAGCT AG 2386 

I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II III I I I I II I 
Db 181 TCTCTAATTATTCAGAAATAGCAAAATTT GAGAAGT CGGTGCCT GAT CACT GTGAGCTCG 24 0 

Qy 23 87 T GGAG GAT T C CT C AC CT GAAT C T GAAC C AGT T GACT T ATT T AGT GAT GAT T C GAT T C C T G 244 6 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 T GGAT GAT T C CT C AC C C GAAT C T GAAC C AGT T GACT TAT T T AGT GAT GAT T CAAT T C C T G 300 

Qy 2447 AAGT C CCACAAACACAAGAGGAGGCT GT GAT GCT CAT GAAGGAGAGT CT CACT GAAGT GT 2506 

I I I II I I I I I I I I II I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 301 AAGT C C C AC AAAC AC AAGAG GAG GC T GT GAT GCT AAT GAAGGAGAGT CT CACT GAAGT GT 3 60 

Qy 2507 C T GAGAC AGT AGC C C AG C AC AAA GAGGAGAGACTTAGTGCCTCACCTCAGGAGCTAG 2563 

I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I III 

Db 361 CT GAGAC AGT AAC AC AAC ACAAAC AT AAG GAGAGACT T AGT GCT T CAC CT CAG GAG GT AG 420 

Qy 2564 GAAAGCCATATTTAGAGT CTTTT CAGCCCAATTTACATAGTACAAAAGAT GCTGCATCTA 2623 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I 
Db 421 GAAAG C CAT AT TT AGAGT CTTTT C AGC C CAAT T T AC AT AT T ACAAAAGAT GCT G C AT CT A 4 80 



Qy 2624 AT GAC AT T C CAAC AT T G AC CAAAAAGGAGAAAAT T T C T TT G C AAAT GGAAGAGT T TAAT A 2683 



Db 481 AT GAAAT T C CAAC AT T GAC CAAAAAG GAGACAAT T T C T T T GCAAAT G GAAGAGT TTAAT A 54 0 

Qy 2684 CT GCAATTTATT CAAAT GAT GACTTACTTT CTT CTAAGGAAGACAAAATAAAAGAAAGT G 2743 

I I I I I I I I I I I M I I I I I II I I I I I I I I I I I I I i I I || I I I i I I | I | | I I I I | | | M | 
Db 541 C T G C AAT T TAT T C C AAT GAT GAC T T AC T T T C T T CT AAGGAAGAC AAAAT GAAAGAAAGT G 600 

Qy 27 44 AAAC AT T T T C AG AT T CAT C T C C GAT T GAG AT AAT AG AT GAAT T T C C C AC G T T T GT C AGT G 2803 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I M I I I II 
Db 601 AAACATTTT CC GATT CAT CT CT CATT GAGATAATAGAT GAGTT T C C CACAT TT GTCAGT G 660 

Qy 2 804 C TAAAGAT GAT T CT C CT AAATT AGC CAAGGAGT ACAC T GAT CT AGAAGT AT C C GACAAAA 2863 

I M I I I I M I M I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I II I I 

Db 661 CT AAAGAT GAT T C T C C T AAGGAGT AC ACT GAC CT AGAAGT AT CCAACAAAA 711 

Qy 28 64 GT GAAAT T GCTAATAT C CAAAGCGGGGCAGATT CAT TGCCTTG CTT AGAATTGCCCTGTG 2923 

! I I I I I I I M I I I I I I M I I I I I I I I I I II I II I I II I I I I I I I I I I I I I I I 
Db 712 GTGAAATTGCTAATGTCCAGAGCGGNGGCAATTCGTTGCCTTGCTCAGAATTGCCCTGTG 771 

Qy 2924 ACCTTT 2929 

I I I I I I 

Db 772 ACCTTT 777 



RESULT 12
CA320635
LOCUS       CA320635     802 bp    mRNA    linear   EST 09-JUL-2003
DEFINITION  UI-M-FW0-ccb-o-24-0-UI.r1 NIH_BMAP_FW0 Mus musculus cDNA clone
            IMAGE:6817489 5', mRNA sequence.
ACCESSION   CA320635
VERSION     CA320635.1  GI:24538759
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 802)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Jim Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..802
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6817489"
                     /tissue_type="whole brain"
                     /dev_stage="embryo 13.5,14.5,16.5,17.5dpc"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_FW0"
                     /note="Organ: Brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated
                     with EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is AGCGAGACAG. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH), Hemin Chin, Ph.D.,
                     program coordinator."
ORIGIN

Query Match 17.7%; Score 660.4; DB 14; Length 802; 

Best Local Similarity 91.6%; Pred. No. 3.8e-87; 

Matches 745; Conservative 0; Mismatches 54; Indels 14; Gaps 4; 

Qy 2148 AAAC C C C C C AC CAT AT GAAGAAGC C AT GAAT GTAGC ACT AAAAGC T T T G GGAACAAAGGA 2207 

I M I I I M I I I I I II I I I I I I II I I I I I I I I I M I I I II I I I II I I I I I I I 
Db 1 AAAAT CC CCACCATAT GAAGAAGCCAT GAGT GTAGCACTAAAAACAT CGGACT CAAAGGA 60 

Qy 2208 AGGAATAAAAGAGC C T GAAAGT T T T AAT G CAGCT GT T C AGGAAACAGAAG CT C CT T AT AT 2267 

M I M I I I I I I I I I I I I I I I I I I I M I I || | | | I I | M | | I I I I I I I I I I I I I I I I 
Db 61 AGAAATTAAAGAGCCTGAAAGTTTTAATGCAGCTGCTCAGGAAGCAGAAGCTCCTTATAT 120 

Qy 2268 AT C CAT T GC GT GT GAT T T AAT T AAAGAAACAAAGCT CT CCACT GAGC CAAGT C C AGAT T T 2327 

M I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I II I I I M 
Db 121 AT C CAT T G CAT GT GAT T TAAT T AAAGAAACAAAGCT CT CCACT GAG C CAAGT C CAGAGTT 180 

Qy 232 8 CTCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGCTAGT 2387 

I I I I I I I M I II I I I I I I I I I I I I I I I I | | I | | | | | | | M | || IN | | | | | | || 
Db 181 C T C TAAT TAT T CAGAAAT AGCAAAAT T T GAGAAGT C GGT GC C T GAT CACT GT GAG C T C GT 24 0 

Qy 2388 G GAGGAT T C C T CACC T GAAT C T GAAC C AGT T GACT T AT T TAGT GAT GAT T C GAT T C CT GA 2447 

Ml I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 241 GGAT GAT T C CT CACC C GAAT CT GAAC C AGTT GACT TAT T TAGT GAT GAT T CAAT T C CT GA 300 

Qy 2448 AGT CCCACAAACACAAGAGGAGGCTGT GAT GCT CAT GAAGGAGAGTCT CACT GAAGTGTC 2507 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I II II I I I I 
Db 301 AGT C CCACAAACACAAGAGGAGGCT GT GATGCTAAT GAAGGAGAGT CT CACT GAAGTGTC 360 

Qy 2508 T GAGAC AGT AG C C C AG CACAAA GAG GAGAGAC T TAGT GC CT CAC C T C AG GAG CT AGG 2564 

M I I I I II I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I | I I I I I I 

Db 361 T GAGACAGTAACACAACACAAACATAAGGAGAGACTTAGT GCTT CACCT CAGGAGGT AGG 420 



Qy 2565 AAAGC CAT AT T TAGAGT C T T T T C AG C C CAAT T T AC ATAGT ACAAAAGAT GC T GC AT C T AA 2624 

M I I I I II I I I I I I I I I I M M I I I I I I I | | | | | M I I I I I I I I I I I I I I II I I I I I I I 



Db 421 AAAG C CAT AT T T AGAGT CT T T T C AG C C CAATT T ACAT AT T ACAAAAGAT GC T G CAT C T AA 480 



Qy 2625 TGACATTCCAACATTGACCAAAAAGGAGAAAATTTCTTTGCAAATGGAAGAGTTTAATAC 268 4 

IN I I I I I I I I I I I I I I I I I I I I I I II I || | | I I I | | | | | | | | | | | | | | | | | | | | | | | 
Db 4 81 T GAAAT T C CAACAT T GAC C AAAAAG GAGACAAT T T C T T T GCAAAT GGAAGAGT T TAAT AC 540 

Qy 2 68.5 TGCAATTTATTCAAATGATGACTTACTTTCTTCTAAGGAAGACAAAATAAAAGAAAGTGA 2744 

I I I I M M I II I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 541 T G CAATT TAT T C CAAT GAT GACT T ACT T T C T T CTAAGGAAGAC AAAAT GAAAGAAAGT G A 600 

Qy 2745 AAC AT T T T C AG AT T CAT C T C C GAT T GAG AT AAT AGAT GAAT T T C C C AC GT T T GT C AGT G C 2804 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I III I II I I I I I I I I 
Db 601 AAC AT T T T C C GAT T CAT C T CN C AT T GAGAT AAT AGAT GAGT T T CN C AC AT T T GT CAGT GC 660 

Qy 2 8 05 TAAAGAT GATT CT C CTAAAT T AGC CAAGGAGT ACAC T GAT CT AGAAGT AT C C GAC AAAAG 2864 

I I M I I I I I I I I I I M I I I I I I I I I I I II I I I I I I II II II I Ml Ml 

Db 661 TAAAGAT GATT CT C CT AAGGAGTACACT GACCTAGAAGT AT CCAACACAAG 711 

Qy 2865 T GAAAT T GC TAAT AT C CAAAGC GG GG CAGAT T C ATT GC CT T GCTT AGAAT TGCCCTGT GA 2 924 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M I | | | | | | I III I I I M I I 
Db 712 TGAAATTGCTAATGTCCAGAGCGGGGCCAATTCGTTGCCTTGCTCAGATTTG-CCTGTGA 770 

Qy 2925 CCTTTCTTT C AAGAAT AT AT AT C C TAAAGAT GA 2957 

I I I I I I I I I I I I Ml I I I M I I I I I I I I I I 
Db 771 CCTTTCTTT CAN G- AT AC AT AT C C TAAAGAT GA 802 



RESULT 13
BQ892001
LOCUS       BQ892001     951 bp    mRNA    linear   EST 16-AUG-2002
DEFINITION  AGENCOURT_8758347 NIH_MGC_129 Mus musculus cDNA clone IMAGE:6315079
            5', mRNA sequence.
ACCESSION   BQ892001
VERSION     BQ892001.1  GI:22284015
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 951)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Susan L. Sullivan, PhD.
            cDNA Library Preparation: ResGen, Invitrogen Corp
            cDNA Library Arrayed by: The I.M.A.G.E. Consortium (LLNL)
            DNA Sequencing by: Agencourt Bioscience Corporation
            Clone distribution: MGC clone distribution information can be
            found through the I.M.A.G.E. Consortium/LLNL at:
            http://image.llnl.gov
            Plate: LLAM13744 row: n column: 08
            High quality sequence start: 6
            High quality sequence stop: 629.
FEATURES             Location/Qualifiers
     source          1..951
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:6315079"
                     /lab_host="DH10B (phage-resistant)"
                     /clone_lib="NIH_MGC_129"
                     /note="Organ: olfactory epithelium; Vector: pCMV-SPORT6.1;
                     Site_1: EcoRV; Site_2: NotI; Cloned unidirectionally.
                     Primer: Oligo dT. Average insert size 2.2 kb. Constructed
                     by ResGen, Invitrogen Corp. Note: this is a NIH_MGC
                     Library."
ORIGIN



Query Match 17.6%; Score 659.2; DB 13; Length 951; 

Best Local Similarity 89.3%; Pred. No. 5.7e-87; 

Matches 780; Conservative 0; Mismatches 79; Indels 14; Gaps 6; 

Qy 264 6 AAAG GAGAAAAT T T CT T T G CAAAT GGAAGAGT T T AAT ACT GCAATT T AT T CAAAT GAT GA 27 05 

HI I I I M I I I I I M I I I I II I I I I I I I I I I I I I | | | | | | | | | | | | | MINIM 
Db 11 AAAAGGAGAAATTTCTTTGCAAATGGAAGAGTTTAATACTGCAATTTATTCCAATGATGA 7 0 

Qy 2706 CT T AC T T T C TT CT AAGGAAGACAAAAT AAAAGAAAGT GAAACAT T T T C AGAT T CAT CT C C 27 65 

I I I I I I I M I I I I I M I I M I I I I I M I I I II M I I I I I II I MM II 

Db 71 CT T AC T T T C TT CT AAGGAAGACAAAAT GAAAGAAAGT GAAACAT TTT C C GAT T CAT CT C C 130 

Qy 2766 GAT T GAGAT AAT AGAT GAAT T T C C CAC GT T T GT CAGT G CTAAAGAT GAT T CT C C TAAAT T 2825 

I I I I I I I I I I I M I I I I I I I I I I I I I I I I I | | | | | M | | | | | | | | | | M I II 
Db 131 CAT T GAGAT AAT AGAT GAGT T T C C C ACAT T T GT C AGT G CTAAAGAT GAT T CT C CT 185 

Qy 2826 AGC CAAG GAGT AC ACT GAT CT AGAAGT AT C C GACAAAAGT GAAATT GCT AAT AT CCAAAG 28 85 

I I I I I I I I II I I II I I I I I I I I I I I I M I I I I II M I I I I I I I I I I I II I M 
Db 18 6 AAGGAGT AC ACT GAC CT AGAAGT AT C CAACAAAAGT GAAAT T GCT AAT GT CCAGAG 241 

Qy 2 8 86 CGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTCAAGAATATATA 2945 

I I I I I I I I I I I I I I II I I I I I I I I I I I I || I I I I II I I I I I I I I I I I | M M III 
Db 242 CGGGGCCAATTCGTTGCCTTGCTCAGAATTGCCCTGTGACCTTTCTTTCAAGAATACATA 301 

Qy 2946 T C CTAAAGAT GAAGT AC AT GTT T C AGAT GAAT T C T C C GAAAAT AGGT C CAGT GT AT CTAA 3005 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I M I I I I III I I I I I I I I I | | || | | | || 

Db 302 T C CTAAAGAT GAAGC AC AT GT CT C AGAT GAAT T CT C CAAAAGT AGGT CCAGT GT AT CTAA 361 

Qy 3006 GGCATCCATATCGCCTT CAAAT GTCTCTGCTTTGGAACCTCAGACAGAAATGGGCAGCAT 3065 

II M IN II I I II II I I I I I I II I I I I II I I I I I I I I II I I I I I I III 

Db 362 GGTGCCCTTATTGCTTC CAAAT GTTTCTGCTTTGGAATCTCAAATAGAAATGGGCAAC AT 421 

Qy 3066 AGT TAAAT C CAAAT CAC T T AC GAAAGAAGCAGAGAAAAAAC TTCCTTCT GACAC AGAGAA 3125 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I II I I II I I I I I II I 
Db 422 AGT T AAAC C CAAAGTACT T AC GAAAGAAG CAGAGGAAAAACT T C CT T CT GAT AC AGAGAA 4 81 

Qy 312 6 AGAG GAC AGAT C C CT GT C AGC T GT AT T GT CAGCAGAG CT GAGT AAAAC T T CAGT T GTT GA 318 5 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 
Db 482 AGAGGAC AGAT C C CT G AC AG C T GT AT T GT C AGCAGAGC T GAAT AAAAC T T C AGT T GT T GA 541 

Qy 3186 CCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCT 324 5 

I I I I I I I I I I I I I I | | | | | | | | | | | | || I I I I I I I I I I I I I I I I I || I I I | || || I I II 
Db 5 42 C CT C C T GT ACT G GAGAGAC AT T AAGAAGACT GGAGT G GT GT T T G GT G C C AGC T TAT T C CT 601 



Qy 3246 GCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTG-CCTTGGCCC 3304 

I I I I M I I I I M I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I II II 
Db 602 GCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCCTTGGCCC 661 

Qy 3305 T G CT CT C GGT GACT AT CAG C T T T AGGAT AT AT AAGG GC GT GAT C C AGG CT AT C C AGAAAT 3364 

I I I I I I I I I I I I I I I I I I I I I | M I II I I I I MINIM II I II I I M II II 

Db 662 TGCTCTCTGT GACT AT N CAG C T T AGGAT AT ATAAG G GT GT GAT C CAAGCT AT C C AGAAAT 721 

Qy 3365 C AGAT GAAG G C CA- C C CAT T CAGGGC AT AT T TAGAAT C T GAAGT T G CT AT AT CAGAGGAA 3423 

I M I II I II II II III I I II I II I II I I || I I II I I II I II | | || | || | M Ml 
Db 722 C AGAT GAAG GC C AC C C C T T T CAG G GCAT AT T T G GGAT C T GAAGT T GC CAT AT CAGAAGAA 781 

Qy 3424 T T GG- T T CAGAAAT AC A- GT AAT TCTGCTCTTGGT CAT GT GAACAG- C ACAAT AAAAGAA 3480 

M I I I I I I I I II II I I I I I I II I I II M I I MM I II I II II II II I II II 
Db 782 T T G GT T T CAGAAAT AT AGGAAAT TCTGCTCTT GGGC AT GGG GAC C G C C ACAAT AAAAGAA 841 

Qy 34 81 CTGAGGCGGCTTTTCTTAGTTGATGATTTAGTT 3513 

II I II I I II I I II III 

Db 842 ATGGAGGGGGTCTCCTCCTAAGTTAATGGATTT 874 



RESULT 14
CF726835
LOCUS       CF726835     767 bp    mRNA    linear   EST 09-OCT-2003
DEFINITION  UI-M-HB0-cki-m-06-0-UI.r1 NIH_BMAP_HB0 Mus musculus cDNA clone
            IMAGE:30548549 5', mRNA sequence.
ACCESSION   CF726835
VERSION     CF726835.1  GI:37601003
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 767)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. James Lin University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Distribution information can be found at
            http://genome.uiowa.edu/distribution/mousef1.html
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..767
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:30548549"
                     /tissue_type="whole eye"
                     /dev_stage="embryo 12.5,13.5,14.5 dpc"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_HB0"
                     /note="Organ: Eye; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated
                     with EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is TTATTGAAGT. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH)."
ORIGIN



Query Match 17.4%; Score 651.2; DB 14; Length 767; 

Best Local Similarity 93.4%; Pred. No. 8.5e-86; 

Matches 718; Conservative 0; Mismatches 34; Indels 17; Gaps 3; 

Qy 770 ATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAA 829 

I I I M I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I II I I I I I I I I | | M I II 
Db 1 ATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAG-AA 59 

Qy 830 AAAT TAT GGATT T GAT G GAG CAGC C AGGT AACACT GT TT C GT CT GGT CAAGAGGAT TT CC 889 

I I M M I I I I I I I I I I II I I II I I I I I I I II I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I 
Db 60 AAAT TAT G GATT T GAAGGAGC AG C GAG GTAACAC TGTTTCGTCTGGT CAAGAGGAT TT C C 119 

Qy 890 CATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTT 949 

M M I M I I I I I I I I I M I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I 
Db 120 CATCTGTCCTGTTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTT 179 

Qy 950 CT T T TAAAGAACAT GGAT AC CT T GGT AACT T AT CAGCAGT GT CAT C CT C AGAAG GAACAA 1009 

I I I I II I I I I I I I I I I II I I II I I I II I I I I I I I I M I I I I I I I I I I I I I I M I I I 
Db 180 C T T T T AAAGAAC AC GGAT AC CT T G GT AAC T TAT C AG C AGT GGCAT C C AC AGAAGGAACT A 239 

Qy 1010 T T GAAGAAACT T T AAAT GAAG CT T CTAAAGAGT T GC CAGAGAGGG CAACAAAT C CAT T T G 1069 

I I I I I I I I I I I I I I I I I II I I I I I I I I III I I I I I I I I I I I I I I I I I I I I || || I || I 
Db 240 T T GAAGAAACT T T AAAT GAAGC T T C T AGAGAAT T G C C AGAGAGGGCAACAAAT C CAT T T G 299 

Qy 1070 T AAAT AGAGAT T T AGC AGAAT T T T C AGAAT T AGAAT AT T CAGAAAT GGGAT CAT CTT TT A 1129 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I || I I I I II I 
Db 300 T AAAT AGAGAGT C AGC AGAGT T T T C AGT AT T AGAAT AC T CAGAAAT GG GAT CAT CT T T C A 359 

Qy 1130 AAGGCT C C CCAAAAGGAGAGT CAGC C AT ATT AGT AGAAAACACT AAGGAAGAAGTAATT G 1189 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I | I | || I I I I 
Db 360 AT GGCT C C C CAAAAG GAGAGT C AG C CAT GT T AGT AGAAAAC ACTAAG GAAGAAGT AAT T G 419 

Qy 1190 T GAG GAGT AAAGACAAAGAG GAT T T AGT T T GT AGT G CAG C C CT T CAC AGT C C ACAAGAAT 1249 

I I M I II I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I | | | | | | M I II II I 
Db 420 T GAGGAGTAAAGACAAAGAGGAT T T AGT T T GT AGT GCAGC C CTT CAT AAT C C ACAAGAGT 479 

Qy 1250 CACCT GT G G GTAAAGAAGAC AGAGT T GT GT CT C C AGAAAAGACAA 1294 

I I I I I I I I I I I I I I I I I M I I I I I II I I I I II I I I I I I I I I I 
Db 480 CAC C T GC GAC C CT TACTAAAGT GGT TAAAGAAGAC G GAGT TAT GT CT CC AGAAAAGACAA 539 



Qy 12 95 T GGACAT T T T T AAT GAAAT GC AGAT GT C AGT AGT AGC AC C T GT GAG GGAAGAGT AT GCAG 1354 

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 540 T GGACAT T T T TAAT GAAAT GAAAAT GT C AGT GGT AGCAC C T GT GAG GGAAGAGT AT GCAG 599 

Qy 1355 ACTTTAAGCCAT TT GAACAAGCAT GGGAAGT GAAAGAT ACTT AT GAGGGAAGTAGGGATG 1414 

I I I I I I I I M II I I I I I I M II I I I I I I I I I II I I I II I I I I I I I I I || | | | | | I | | | | 
Db 600 AT TT TAAGC CAT T T GAACAAG CAT GG GAAGT GAAAGAT AC T TAT GAG GGAAGT AGGGAT G 659 

Qy 1415 TGCTGGCTGCTAGAGCTAATGTGGAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCC 14 74 

I I I I M I I I I I M I I I I I I I I I I M II I I I I II I I I I I I I I I I II I I II I I I I I I I I 
Db 660 T GC T GGCT G C T AGAGC T AATAT G GAAAGTAAAGT GGAC ACAAAAT GCT T T GAAGAT AGCC 719 

Qy 14 75 T GGAGCAAAAAAGT CTT GGGAAGGATAGT GAAGGCAGAAAT GAGGAT GC 1523 

I I I I I I Ml III I I I I II I I I I I I II I I I I I I II I I I I I I I I I 
Db 720 T G GAGC - NAAAG GT CAT GG GAAGGAT AGT GAAAGC AGAAAT GAGAAT GC 767 



RESULT 15
BU612951
LOCUS       BU612951     739 bp    mRNA    linear   EST 20-FEB-2003
DEFINITION  UI-M-FR0-cbd-a-04-0-UI.r1 NIH_BMAP_FR0 Mus musculus cDNA clone
            UI-M-FR0-cbd-a-04-0-UI 5', mRNA sequence.
ACCESSION   BU612951
VERSION     BU612951.1  GI:23279166
KEYWORDS    EST.
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1  (bases 1 to 739)
  AUTHORS   NIH-MGC http://mgc.nci.nih.gov/.
  TITLE     National Institutes of Health, Mammalian Gene Collection (MGC)
  JOURNAL   Unpublished (1999)
COMMENT     Contact: Robert Strausberg, Ph.D.
            Email: cgapbs-r@mail.nih.gov
            Tissue Procurement: Dr. Jim Lin, University of Iowa
            cDNA Library preparation: Dr. M. Bento Soares, University of Iowa
            cDNA Library Arrayed by: Dr. M. Bento Soares, University of Iowa
            DNA Sequencing by: Dr. M. Bento Soares, University of Iowa
            Clone Distribution: Clone distribution information can be obtained
            from Dr. M. Bento Soares, bento-soares@uiowa.edu
            This clone was contributed by the Brain Molecular Anatomy Project
            (BMAP)
            Seq primer: pYX-5.
FEATURES             Location/Qualifiers
     source          1..739
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="C57BL/6"
                     /db_xref="taxon:10090"
                     /clone="UI-M-FR0-cbd-a-04-0-UI"
                     /tissue_type="whole brain"
                     /dev_stage="embryo 13.5,14.5,16.5,17.5dpc"
                     /lab_host="DH10B (T1 phage resistant)"
                     /clone_lib="NIH_BMAP_FR0"
                     /note="Organ: Brain; Vector: pYX-Asc; Site_1: EcoR I;
                     Site_2: Not I; The library was constructed according
                     Bonaldo, Lennon and Soares, Genome Research, 6:791-806,
                     1996. Denatured RNA was size fractionated on a 1% agarose
                     gel. First strand cDNA synthesis was primed with oligo-dT
                     primer containing a Not I site. Double strand cDNA was
                     size selected according to mRNA size fraction, ligated
                     with EcoR I adaptor, digested with NotI and then cloned
                     directionally into pYX-Asc vector. The library tag
                     sequence located between the Not I site and the polyA tail
                     is AGCGAGACAG. This library was created for the University
                     Iowa Brain Anatomy Project (BMAP): 'Gene Discovery in the
                     Developing Mouse Nervous System', supported by National
                     Institute of Mental Health (NIMH), Hemin Chin, Ph.D.,
                     program coordinator."
ORIGIN

Query Match 17.3%; Score 648.6; DB 13; Length 739; 

Best Local Similarity 92.9%; Pred. No. 2e-85; 

Matches 694; Conservative 0; Mismatches 44; Indels 9; Gaps 1; 

Qy 2721 G G AAG AC AAAAT AAAAGAAAGT GAAAC AT T T T C AG AT T CAT C T C C GAT T GAG AT AAT AG A 2780 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I M I I I I I I I I I I I I I I 
Db 2 GGAAGACAAAAT GAAAGAAAGT GAAACATTTT C CGATT CAT CT C CCATT GAGATAATAGA 61 

Qy 2781 T GAATTT C C CAC GT T T GT C AGT GCT AAAGAT GAT T CT C CT AAAT T AGC CAAG GAGT AC AC 2 840 

Ml I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I | I I I I I II I I I I I I I 

Db 62 T GAGTT T C C CAC AT T T GT C AGT GC TAAAGAT GAT T CT C CT AAGGAGTACAC 112 

Qy 2 841 T GAT CT AGAAGT AT C CGACAAAAGT GAAAT T G C TAAT AT C CAAAGC GG G GC AGAT T CAT T 2 900 

Ml I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II 
Db 113 TGACCTAGAAGTATCCAACAAAAGTGAAATTGCTAATGTCCAGAGCGGGGCCAATTCGTT 172 

Qy 2901 GC CT T GCT T AGAAT T GC C C T GT GAC CTTTCTTT CAAGAATAT AT AT C C TAAAGAT GAAGT 2 960 

I M M I I I I I I I I I I I I II I I I I I II I I I II I I I I I I I I I I I I II I I I I I I I I I I || 
Db 173 GCCTTGCT CAGAAT T GC C CT GT GAC CT T T CT T T CAAGAAT ACAT AT C C TAAAGAT GAAGC 232 

Qy 2961 ACAT GT T T C AGAT GAAT T CT C C GAAAAT AG GT C C AGT GT AT CT AAG GC AT C CAT AT C G C C 3020 

I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I II III II 
Db 233 ACATGTCTCAGATGAATTCTCCAAAAGTAGGTCCAGTGTATCTAAGGTGCCCTTATTGCT 292 

Qy 3021 T T CAAAT GTCTCTGCTTT GGAAC C T C AGAC AGAAAT G GGC AGC AT AGT T AAAT C CAAAT C 3080 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2 93 TC CAAAT GTTTCTGCTTTGGAATCT CAAAT AGAAAT GGGCAACATAGTTAAACCCAAAGT 352 

Qy 3 081 ACTTACGAAAGAAGCAGAGAAAAAACTT CCTT CT G AC AC AG AG AAAG AG GAC AG AT C CCT 3140 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I II 
Db 353 ACTTACGAAAGAAGCAGAGGAAAAACTT CCT T CT GATACAGAGAAAGAGGACAGAT CC CT 412 

Qy 3141 GTCAGCTGTATTGTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTCTACTGGAG 3200 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 413 GACAGCTGT ATT GTCAGCAGAGCT GAAT AAAACTTCAGTTGTTGACCT CCT GTACTGGAG 4 72 

Qy 3201 AGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGAC 32 60 

M I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 1 I I I I I I I 
Db 473 AGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGAC 532 

Qy 3261 AGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTAT 3320 



Db 533 AGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCTGTGACTAT 592 

Qy 3321 C AGCT T TAG GAT AT ATAAG G GC GT GAT C C AGGC T AT C C AGAAAT C AGAT GAAG G C CAC C C 338 0 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 5 93 C AG C T T T AGGAT AT AT AAGG GT GT GAT C CAAGCT AT C C AGAAAT CAGAT GAAGGC CACC C 652 

Qy 3381 AT T C AG GGC AT AT T T AGAAT CT GAAGT T GCT AT AT C AGAGGAAT T G GT T C AGAAAT AC AG 3440 

I M I I I I I I II I I I I I II II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II 
Db 653 AT T C AGGG CATAT T T GGAAT C T GAAGT T GC CAT AT C AGAG GAAT T GGT T CAGAAATAT AG 712 

Qy 3441 TAATTCTGCTCTTGGTCATGTGAACAG 3467 

I I I I I I I I I II I II I I I I I I I I I I I I I 
Db 713 TAATTCTGCTCTTGGTCATGTGAACAG 739 



Search completed: September 11, 2004, 15:09:45 
Job time : 8884.33 secs 



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 
Run on:         September 11, 2004, 00:39:35 ; Search time 14076.4 Seconds
                                           (without alignments)
                                           11519.006 Million cell updates/sec

Title:          US-09-830-972-1
Perfect score:  3741
Sequence:       1 attgctcgtctgggcggcgg gattgaagcgcaaagcagat 3741

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       3470272 seqs, 21671516995 residues

Total number of hits satisfying chosen parameters:      6940544

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
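Note on the throughput figure: the reported rate can be cross-checked from the other header values. The sketch below assumes that a "cell update" is one dynamic-programming cell (one query residue against one database residue) and that both strands of every database entry are scored. The two-strand factor is an assumption; it is suggested by the hit count (6940544, which is twice 3470272 sequences) but is not stated in the report. The function name is illustrative only.

```python
# Figures copied from the search header of this report.
QUERY_LENGTH = 3741            # residues in query US-09-830-972-1
DB_RESIDUES = 21_671_516_995   # "Searched: ... residues"
SEARCH_SECONDS = 14076.4       # "Search time ... Seconds"
STRANDS = 2                    # assumption: forward and reverse strands are both scored


def million_cell_updates_per_sec(query_len, db_residues, seconds, strands=STRANDS):
    """Return dynamic-programming cells filled per second, in millions."""
    cells = query_len * db_residues * strands
    return cells / seconds / 1e6


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate:.3f} million cell updates/sec")
```

Under these assumptions the result agrees with the reported 11519.006 to within about 0.02; the small remainder is plausibly due to rounding of the search time.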



Database :      GenEmbl:*
                1:  gb_ba:*
                2:  gb_htg:*
                3:  gb_in:*
                4:  gb_om:*
                5:  gb_ov:*
                6:  gb_pat:*
                7:  gb_ph:*
                8:  gb_pl:*
                9:  gb_pr:*
                10: gb_ro:*
                11: gb_sts:*
                12: gb_sy:*
                13: gb_un:*
                14: gb_vi:*
                15: em_ba:*
                16: em_fun:*
                17: em_hum:*
                18: em_in:*
                19: em_mu:*
                20: em_om:*
                21: em_or:*
                22: em_ov:*
                23: em_pat:*
                24: em_ph:*
                25: em_pl:*
                26: em_ro:*
                27: em_sts:*
                28: em_un:*
                29: em_vi:*
                30: em_htg_hum:*
                31: em_htg_inv:*
                32: em_htg_other:*
                33: em_htg_mus:*
                34: em_htg_pln:*
                35: em_htg_rod:*
                36: em_htg_mam:*
                37: em_htg_vrt:*
                38: em_sy:*
                39: em_htgo_hum:*
                40: em_htgo_mus:*
                41: em_htgo_other:*




Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 

                     %
Result             Query
   No.    Score    Match   Length   DB  ID          Description
     1   3739.4    100.0     4684   10  RN0242961   AJ242961 Rattus no
     2   3489       93.3     3489    6  AX766046    AX766046 Sequence
     3   3202.4     85.6     4627   10  AY102284    AY102284 Mus muscu
     4   3200.4     85.5     3821   10  AY114152    AY114152 Mus muscu
     5   3140.4     83.9     4518   10  BC056373    BC056373 Mus muscu
     6   2651       70.9     4063   10  AY102280    AY102280 Mus muscu
     7   2543.6     68.0     3815   10  BC032272    BC032272 Mus muscu
     8   2391.4     63.9     4166    9  AB040462    AB040462 Homo sapi
     9   2391.4     63.9     4789    9  AY102279    AY102279 Homo sapi
    10   2353.8     62.9   218532    2  AC131431    AC131431 Rattus no
    11   2353.8     62.9   238341    2  AC133315    AC133315 Rattus no
    12   2343.6     62.6     4053    6  AX195249    AX195249 Sequence
    13   2343.6     62.6     4053    9  AB020693    AB020693 Homo sapi
    14   2343.6     62.6     4632    9  AF148537    AF148537 Homo sapi
    15   2333.2     62.4     4093    6  BD270070    BD270070 Secreted
    16   2323.8     62.1     4822    6  AR220865    AR220865 Sequence
    17   2289.2     61.2     3576    6  AX766050    AX766050 Sequence
    18   2289.2     61.2     3579    6  BD249446    BD249446 Protein s
    19   2289.2     61.2     3579    9  HSA251383   AJ251383 Homo sapi
    20   2062       55.1    60615   10  AY102286    AY102286 Mus muscu
    21   2062       55.1   166516    2  AC135510    AC135510 Mus muscu
    22   2062       55.1   211357    2  AC113284    AC113284 Mus muscu
    23   2062       55.1   212042   10  AL929371    AL929371 Mouse DNA
    24   2011       53.8     4102    9  AY123245    AY123245 Homo sapi
    25   2009.8     53.7     4123    9  AY123247    AY123247 Homo sapi
    26   2009.6     53.7     4070    9  AY123249    AY123249 Homo sapi
    27   2008.6     53.7     3491    9  AF333336    AF333336 Homo sapi
    28   2008.6     53.7     4109    9  AY123248    AY123248 Homo sapi
    29   2007.6     53.7     4060    9  AY123250    AY123250 Homo sapi
    30   2007.6     53.7     4160    9  AY123246    AY123246 Homo sapi
    31   1806.4     48.3     2958   10  BC032192    BC032192 Mus muscu
    32   1769.4     47.3     2883    9  AF320999    AF320999 Homo sapi
    33   1541.2     41.2     1738   10  AB073672    AB073672 Mus muscu
    34   1486.6     39.7     2481    9  AF063601    AF063601 Homo sapi
    35   1468       39.2    39674    9  AC092461    AC092461 Homo sapi
    36   1468       39.2    90756    9  AY102285    AY102285 Homo sapi
    37   1468       39.2   162692    2  AC016171    AC016171 Homo sapi
    38   1411.2     37.7     2386    6  AX099401    AX099401 Sequence
    39   1411.2     37.7     2386    6  BD190738    BD190738 Secreted
    40   1088.8     29.1     1980    6  BD083733    BD083733 Nucleic a
    41   1088.8     29.1     1980    6  BD097380    BD097380 Nucleic a
    42    864.6     23.1     1631    9  AK098385    AK098385 Homo sapi
    43    809.8     21.6     2782    6  AX700396    AX700396 Sequence
    44    809.8     21.6     2782   10  AF132045    AF132045 Rattus no
    45    809.8     21.6     2782   10  AY164741    AY164741 Rattus no

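Note on the "% Query Match" column: the listed values are consistent with the hit score divided by the perfect score (3741), expressed as a percentage and rounded to one decimal place. The report does not define the column, so this reading is an assumption; the short sketch below simply recomputes a few rows under it. The function name is hypothetical.

```python
PERFECT_SCORE = 3741.0  # "Perfect score" from the search header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Score as a percentage of the perfect score, rounded to one decimal."""
    return round(100.0 * score / perfect_score, 1)


# (result number -> score) pairs taken from the summary listing
sample_rows = {1: 3739.4, 2: 3489.0, 3: 3202.4, 20: 2062.0, 45: 809.8}
for result_no, score in sample_rows.items():
    print(result_no, score, query_match_percent(score))
```

For the sampled rows this reproduces the listed 100.0, 93.3, 85.6, 55.1 and 21.6.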



ALIGNMENTS 



RESULT 1 

RN0242961 

LOCUS       RN0242961    4684 bp    mRNA    linear   ROD 28-JAN-2000
DEFINITION  Rattus norvegicus mRNA for Nogo-A protein.
ACCESSION   AJ242961
VERSION     AJ242961.1  GI:6822246
KEYWORDS    Nogo-A protein.
SOURCE      Rattus norvegicus (Norway rat)
  ORGANISM  Rattus norvegicus
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae;
            Rattus.
REFERENCE   1
  AUTHORS   Chen,M.S., Huber,A.B., van der Haar,M.E., Frank,M., Schnell,L.,
            Spillmann,A.A., Christ,F. and Schwab,M.E.
  TITLE     Nogo-A is a myelin-associated neurite outgrowth inhibitor and an
            antigen for monoclonal antibody IN-1
  JOURNAL   Nature 403 (6768), 434-439 (2000)
  MEDLINE   20129258
   PUBMED   10667796
REFERENCE   2  (bases 1 to 4684)
  AUTHORS   Van der Haar,M.E.
  TITLE     Direct Submission
  JOURNAL   Submitted (14-JUN-1999) Van der Haar M.E., Department of
            Neuromorphology, Brain Research Institute, University of Zurich,
            Winterthurerstrasse 190, Zurich, CH-8057, SWITZERLAND
FEATURES             Location/Qualifiers
     source          1..4684
                     /organism="Rattus norvegicus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10116"
     gene            1..4684
                     /gene="nogo-A"
     CDS             253..3744
                     /gene="nogo-A"
                     /function="Neurite outgrowth inhibition"
                     /codon_start=1
                     /product="Nogo-A protein"
                     /protein_id="CAB71027.1"
                     /db_xref="GI:6822247"
                     /db_xref="GOA:Q9JK11"
                     /db_xref="SWISS-PROT:Q9JK11"
                     /translation="MEDIDQSSLVSSSTDSPPRPPPAFKYQFVTEPEDEEDEEEEEDE
                     EEDDEDLEELEVLERKPAAGLSAAAVPPAAAAPLLDFSSDSVPPAPRGPLPAAPPAAP
                     ERQPSWERSPAAPAPSLPPAAAVLPSKLPEDDEPPARPPPPPPAGASPLAEPAAPPST
                     PAAPKRRGSGSVDETLFALPAASEPVIPSSAEKIMDLMEQPGNTVSSGQEDFPSVLLE
                     TAASLPSLSPLSTVSFKEHGYLGNLSAVSSSEGTIEETLNEASKELPERATNPFVNRD
                     LAEFSELEYSEMGSSFKGSPKGESAILVENTKEEVIVRSKDKEDLVCSAALHSPQESP
                     VGKEDRVVSPEKTMDIFNEMQMSVVAPVREEYADFKPFEQAWEVKDTYEGSRDVLAAR
                     ANVESKVDRKCLEDSLEQKSLGKDSEGRNEDASFPSTPEPVKDSSRAYITCASFTSAT
                     ESTTANTFPLLEDHTSENKTDEKKIEERKAQIITEKTSPKTSNPFLVAVQDSEADYVT
                     TDTLSKVTEAAVSNMPEGLTPDLVQEACESELNEATGTKIAYETKVDLVQTSEAIQES
                     LYPTAQLCPSFEEAEATPSPVLPDIVMEAPLNSLLPSAGASVVQPSVSPLEAPPPVSY
                     DSIKLEPENPPPYEEAMNVALKALGTKEGIKEPESFNAAVQETEAPYISIACDLIKET
                     KLSTEPSPDFSNYSEIAKFEKSVPEHAELVEDSSPESEPVDLFSDDSIPEVPQTQEEA
                     VMLMKESLTEVSETVAQHKEERLSASPQELGKPYLESFQPNLHSTKDAASNDIPTLTK
                     KEKISLQMEEFNTAIYSNDDLLSSKEDKIKESETFSDSSPIEIIDEFPTFVSAKDDSP
                     KLAKEYTDLEVSDKSEIANIQSGADSLPCLELPCDLSFKNIYPKDEVHVSDEFSENRS
                     SVSKASISPSNVSALEPQTEMGSIVKSKSLTKEAEKKLPSDTEKEDRSLSAVLSAELS
                     KTSVVDLLYWRDIKKTGVVFGASLFLLLSLTVFSIVSVTAYIALALLSVTISFRIYKG
                     VIQAIQKSDEGHPFRAYLESEVAISEELVQKYSNSALGHVNSTIKELRRLFLVDDLVD
                     SLKFAVLMWVFTYVGALFNGLTLLILALISLFSIPVIYERHQVQIDHYLGLANKSVKD
                     AMAKIQAKIPGLKRKAD"
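Note on the CDS annotation: the /translation qualifier can be spot-checked against the nucleotide sequence shown in the alignment for this result. The sketch below translates the 48 bases at positions 253 to 300 (the start of the annotated CDS, copied from the alignment rows) with the standard genetic code. The helper names are illustrative, and only this short stretch is checked here, not the whole CDS.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (NCBI translation table 1), codons in TCAG order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Bases 253..300 of the RESULT 1 database sequence, as read from the alignment.
cds_start = "ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGC"
print(translate(cds_start))
```

The translated stretch, MEDIDQSSLVSSSTDS, matches the first 16 residues of the /translation qualifier.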



ORIGIN 



Query Match 100.0%; Score 3739.4; DB 10; Length 4684; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 3740; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 ATTGCTCGTCTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCG 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I II I I I I I I I I I I II I I I I I II I I I I 
Db 1 ATTGCTCGTCTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCG 60 

Qy 61 ATCGCGAAGGCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT 12 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 ATCGCGAAGGCAGGAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT 12 0 

Qy 121 CGGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACA 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I II I I I 1 I I I I I II I I I 
Db 121 CGGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACA 180 

Qy 181 ACCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGC 24 0 

I II I I I I I I II I I I I I II I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

Db 181 ACCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGC 24 0 

Qy 241 GAC C C G C C AG C CAT GGAAGAC AT AGAC C AGT CGTCGCTGGTCTCCTCGTC CAC G GAC AG C 300 

II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 241 GAC C C G C C AG C CAT GGAAGAC AT AGAC C AGT CGTCGCTGGTCTCCTCGTC CAC G GAC AG C 300 

Qy 301 CCGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 360 

I I II I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I 
Db 301 CCGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 360 



Qy        361 GACGAGGAGGAGGAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTG 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 GACGAGGAGGAGGAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTG 420



Qy 421 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCG 480 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCG 48 0 

Qy 4 81 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I M 
Db 481 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 540 

Qy 541 GCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCG 600 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I II I I I I I I I II I 
Db 541 GCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCG 600 

Qy 601 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCT 660 

I II I I I I I I I I I I I I I I I I I II I I I I I I II I II II I I I I I I I I I I II I I I I I I || I | I I I 
Db 601 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCT 660 

Qy 661 CCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCG 720 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 661 CCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCG 720 

Qy 721 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTT 78 0 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I 
Db 721 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTT 78 0 

Qy 781 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 84 0 

I I I I II I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 781 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 840 

Qy 841 TT GAT G GAGC AG C C AG GTAACACT GT TTCGTCTGGT CAAGAGGAT TT C C CAT C T GT C CT G 90 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 TTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTG 900 

Qy 901 CTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 960 

I I II I I I I I I I I I I I I I I I I I I I I II I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 901 CTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 960 

Qy 961 CAT GGAT AC CT T GGT AACT T AT CAGCAGT GT CAT C CT C AGAAGGAACAAT T GAAGAAACT 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I 
Db 961 CAT GGAT AC CT T GGTAACT T AT CAGCAGT GT CAT C CT C AGAAGGAACAAT T GAAGAAAC T 1020 

Qy 1021 TTAAAT GAAGCT T CTAAAGAGTT GCCAGAGAGGGCAACAAAT CCAT TT GTAAAT AGAGAT 1080 

I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 1021 T TAAAT GAAGCTT CTAAAGAGT T GC CAGAGAGGG CAAC AAAT C CAT T T GTAAAT AGAGAT 108 0 

Qy 1081 T T AGC AGAAT T T T C AGAAT T AGAAT AT T C AGAAAT GGGAT CAT CT T T T AAAGG C T C C C CA 1140 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I II M II I I I I I I I I 
Db 1081 T T AGC AGAAT T T T CAGAATT AGAAT AT T C AGAAAT GGGAT CAT C T T T T AAAG G CT C C C CA 1140 

Qy 1141 AAAGGAGAGT CAGC CAT ATT AGT AGAAAACAC T AAGGAAGAAGT AAT T GT GAGGAGT AAA 12 00 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I M I I I I I I 
Db 1141 AAAGGAGAGT CAGC CAT AT T AGT AGAAAACAC T AAGGAAGAAGT AAT T GT GAGGAGT AAA 1200 

Qy 1201 GACAAAGAGGAT T T AGT T T GTAGT GCAGC C CT T C AC AGT C CACAAGAAT C AC CT GT GGGT 1260 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I II I I I I I I I I I I I I I I I I 
Db 12 01 GAC AAAGAG GAT T T AGT T T GTAGT GCAGC C C T T CACAGT C CACAAGAAT C AC C T GT GGGT 1260 

Qy       1261 AAAGAAGACAGAGTTGTGTCTCCAGAAAAGACAATGGACATTTTTAATGAAATGCAGATG 1320
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db       1261 AAAGAAGACAGAGTTGTGTCTCCAGAAAAGACAATGGACATTTTTAATGAAATGCAGATG 1320

Qy 1321 T C AGT AGT AGC AC C T GT GAGG GAAG AGT AT GCAGAC T T T AAG C CAT T T GAACAAGCAT GG 1380 

I I I I I I I I I I I I I I I I I I I I I I I II I I M I I I I I I I I I I I I I I I II I I II I I I I I I I I I I 
Db 1321 T CAGT AGT AGCAC CT GT GAG GGAAGAGTAT G C AGACT T TAAGC C ATT T GAAC AAG CAT GG 1380 

Qy 1381 GAAGT GAAAGAT ACT T AT G AG GGAAGT AG GGAT GTGCTGGCTGC T AGAG C T AAT GT GGAA 14 4 0 

I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I 
Db 1381 GAAGT GAAAGAT ACT TAT GAGGGAAGTAGGGAT GTGCT GGCT GCTAGAGCTAAT GT GGAA 1440 

Qy 14 41 AGT AAAGT G GAC AGAAAAT GCT T G GAAGAT AG C CT GGAGCAAAAAAGT C T T GGGAAGGAT 1500 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 1441 AGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCT GGAGCAAAAAAGT CTT GGGAAGGAT 1500 

Qy 1501 AGT GAAG G C AGAAAT GAG GAT GCTTCTTTCCC C AGTAC C C C AGAACCT GT GAAGGAC AGC 1560 

I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 
Db 1501 AGT GAAG G CAGAAAT GAG GAT GCTTCTTTCCC CAGT AC C C C AGAACC T GT GAAGGAC AGC 1560 

Qy 1561 T C C AGAGC AT AT AT T AC CT GT G CT T C CTT T AC CT CAGC AAC C GAAAG C AC C AC AGC AAAC 1620 

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1561 T C CAGAGC AT AT AT T AC CT GTGCTTCCTT T AC CT CAGCAAC C GAAAGCAC CAC AGCAAAC 1620 

Qy 1621 ACTTT CC CTTT GTTAGAAGAT CAT ACTT CAGAAAATAAAACAGAT GAAAAAAAAATAGAA 1680 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I! I I I I I II I 
Db 1621 ACTTTCCCTTTGTTAGAAGATCATACTTCAGAAAATAAAACAGAT GAAAAAAAAATAGAA 1680 

Qy 1681 GAAAG GAAGGC C CAAAT T AT AAC AGAGAAGACT AGC CC CAAAAC GT CAAAT CCTTTCCTT 1740 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1681 GAAAG GAAG GC C CAAAT TAT AAC AGAGAAGACT AGC CC CAAAAC GT CAAAT CCTTTCCTT 1740 

Qy 17 41 GT AGC AGT ACAG GAT T C T G AG GC AGAT TAT GT T ACAAC AGAT AC CT T AT CAAAGGT GACT 18 00 

I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I II I I I I I I I I I II I I I 
Db 1741 GT AGC AGT ACAG GAT T CT GAG GC AGAT TAT GT T ACAAC AGAT AC CT T AT CAAAGGT GACT 18 00 

Qy 18 01 GAGGC AGC AGT GT CAAACAT G C CT GAAGGT C T GAC GC C AGAT T T AGT T C AGGAAGC AT GT 18 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I 
Db 1801 GAGGCAGCAGT GT CAAACAT GCCT GAAGGTCT GAC GCCAGATTTAGTT CAGGAAGCAT GT 1860 

Qy 18 61 GAAAGT GAAC T GAAT GAAG C CAC AGGT ACAAAGAT T GC T TAT GAAACAAAAGT GGACT T G 1920 

I I I I I I I I I I I I II I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 18 61 GAAAGT GAACT GAAT GAAG C CAC AGGT AC AAAGAT T GC T TAT GAAACAAAAGT G GACT T G 1920 

Qy 1921 GT CCAAAC AT C AGAAGCT AT ACAAGAAT CAC T TT AC CC CAC AGC AC AG CT T T GC C CAT C A 1980 

I I I I II I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1921 GT CCAAAC AT C AGAAGCT AT ACAAGAAT C ACT T T AC CC CAC AGCACAG CT T T GC C CAT C A 1980 

Qy 19 81 T T T GAG GAAGCT GAAGCAACT C C GT CAC C AGT T T T GCC T GAT AT T GT T AT G GAAGC AC C A 204 0 

I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 19 81 T T T GAG GAAGCT GAAGCAACT C C GT C AC CAGT T T T GCC T GAT AT T GT T AT G GAAGCAC C A 204 0 

Qy 2041 TTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTG 2100 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 2041 TTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTG 2100 

Qy 2101 GAAGCAC CT C CT C CAGTTAGTT AT GACAGT ATAAAGCT T GAGCCT GAAAAC C CC CCACCA 2160 

I I I I I I I I 1 I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I 



Db 2101 GAAGC AC CT C CT C C AGT T AGTT AT GAC AGT ATAAAGCT T GAGC CT GAAAAC C C C C CAC C A 2160 

Qy 2161 TAT GAAGAAGCCAT GAAT GTAGCACT AAAAGCTTT GGGAACAAAGGAAGGAATAAAAGAG 2220 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2161 TAT GAAGAAGCCAT GAAT GTAGCACTAAAAGCTTT GGGAACAAAGGAAGGAATAAAAGAG 2220 

Qy 2221 C CT GAAAGT T TTAAT GC AGCT GT T CAGGAAAC AGAAG C T C CT T AT AT AT C C ATT GC GT GT 228 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I II I I I I I I I I I I I 

Db 2221 CC T GAAAGT T TTAAT GC AGCT GT T CAGGAAAC AGAAGCT C C T TAT AT AT C C ATT GC GT GT 2280 

Qy 2281 GAT T T AAT TAAAGAAAC AAAGCT C T C C ACT GAGC CAAGT C CAGAT T T CT CTAATT ATT C A 234 0 

II I M I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I 

Db 2281 GAT T TAAT TAAAGAAACAAAGC T CT C C ACT GAGC CAAGT C C AGAT TT CT CT AAT TAT T C A 234 0 

Qy 2341 GAAAT AGC AAAAT T C GAGAAGT CGGTGCCC GAACAC GCT GAGCTAGT GGAGGAT T C CT C A 2400 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 2341 GAAAT AGCAAAAT T C GAGAAGT C GGT GC C C GAACAC GCT GAGCTAGT GGAGGAT T C CT C A 2400 

Qy 2401 CCT GAAT C T GAAC C AGT T GACT TAT T T AGT GAT GAT T CGATT C CT GAAGT C C CACAAAC A 2460 

I I I I I I I I M II I I I I I I I II I II I I II I I I I I I I I I I I I I I I I I I I I I M II I || I I M 
Db 2401 CCT GAAT CT GAAC CAGT T GACT T ATT T AGT GAT GAT T CGAT T CCT GAAGT C C CAC AAAC A 2460 

Qy 2461 CAAGAGGAGGCT GT GAT GCT CAT GAAGGAGAGT CT CACT GAAGT GTCTGAGACAGTAGCC 2520 

I I I I M I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 2461 CAAGAGGAGGCT GT GAT GCT CAT GAAGGAGAGT CT CACT GAAGT GTCTGAGACAGTAGCC 2520 

Qy 2521 C AGC ACAAAGAG GAGAGACTT AGT G C CT C AC C T CAG GAGC TAGGAAAGC CAT ATT TAGAG 258 0 

I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I M I I I I I I I I I I I I I I 
Db 2521 CAG C ACAAAGAG GAGAGACT T AGT GC C T CAC C T C AGGAGCT AGGAAAGC CAT AT T TAGAG 2 58 0 

Qy 2581 T CT T T T CAGC C CAAT T T AC AT AGT ACAAAAGAT GCT GC AT C TAAT GAC AT T C CAACAT T G 264 0 

I I M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I II I I I M I I I I I I I 
Db 2581 T CTT T T CAG C C CAATT T AC AT AGT ACAAAAGAT GCT GC AT CT AAT GAC ATT C CAACAT T G 264 0 

Qy 2641 AC CAAAAAG GAGAAAAT T T C T T T G CAAAT GGAAGAGT TTAAT ACT GCAAT T T ATT CAAAT 2700 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 1 I I I I I I II I I I I I I I I I I I I I I 
Db 2641 AC CAAAAAG GAGAAAAT T T C T T T G CAAAT G GAAGAGT T TAAT ACT GCAAT T TAT T CAAAT 2700 

Qy 2701 GAT GAC T T AC T T T C T T C T AAG GAAGAC AAAAT AAAAGAAAG T GAAAC AT T T T CAGAT T C A 2760 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I 
Db 2701 GAT GACT TACT T T CT T CT AAG GAAGACAAAAT AAAAGAAAGT GAAACAT TT T CAGAT T C A 2760 

Qy 2761 T C T C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT T T GT CAGT GCTAAAGAT GATT C T C C T 2 82 0 

I I I I I I I I I II M I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 27 61 T C T C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT T T GT CAGT GCTAAAGAT GATT C T C CT 2820 

Qy 2821 AAAT T AGC CAAGGAGT AC ACT GAT CT AGAAG TAT C C GACAAAAGT GAAAT T G CT AAT AT C 2 880 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 2 821 AAAT T AGC CAAGGAGT AC ACT GAT C T AGAAG TAT C C GACAAAAGT GAAAT T GCT AAT AT C 2 880 

Qy 2881 CAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTCAAGAAT 2 940 

I I I I I I I I I I I I II I I I I I I I I I I I II I I I II I I I I I I I I I M I I I I I II I I I I I I I I II 
Db 28 81 CAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTCAAGAAT 294 0 

Qy 2941 AT AT AT C CT AAAGAT GAAGT ACAT GT T T CAGAT GAAT T C T C C GAAAAT AG GT C CAGT GT A 3000 

I I I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I 
Db 2941 AT AT AT C CT AAAGAT GAAGT AC AT GT T T CAGAT GAAT T CT C C GAAAAT AG GT C C AGT GT A 3000 



Qy 30 01 T CT AAG G CAT C CAT AT C G CCT T CAAAT GTCTCTGCTTTG GAAC CT C AGAC AGAAAT GGG C 3060 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 
Db 3001 TCTAAGGCATCCATATCGCCTTCAAATGTCTCTGCTTTGGAACCTCAGACAGAAATGGGC 3060 

Qy 30 61 AGCATAGTTAAAT CCAAATCACTTACGAAAGAAGCAGAGAAAAAACTT C CT T CT GAC ACA 3120 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 30 61 AGC AT AGT T AAAT C CAAAT CAC T T AC GAAAGAAGC AGAGAAAAAACT T C C T T C T GAC AC A 312 0 

Qy 3121 GAGAAAGAGGAC AGAT C CCT GT CAGCT GTATT GT CAGCAGAGCT GAGTAAAACTTCAGTT 3180 

I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
Db 3121 GAGAAAGAGGAC AGAT C CCT GT CAGCT GTATT GT CAGCAGAGCT GAGTAAAACTTCAGTT 318 0 

Qy 3181 GT T GAC C T C CT C T AC T GGAGAGACAT TAAGAAGACT GGAGT GGT GTTTGGTGC CAGC T T A 324 0 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I II I I I I I I I 
Db 3181 GT T GAC CT CCT CT ACT G GAGAGACAT TAAGAAGAC T GGAGT GGT GTTTGGTGC CAGCT T A 324 0 

Qy 3241 TTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTG 3300 

I I I I I I I I I II I I I I I I I I I I II 11 I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 32 41 TTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTG 3300 

Qy 3301 GCCCTGCTCTCGGT GACT AT C AG CT T T AGGAT AT ATAAGG GC GT GAT C C AG GC T AT C C AG 3360 

I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I 
Db 3301 GCCCTGCTCTCGGT GACT AT CAGCTTTAGGAT AT AT AAGGGCGT GAT CCAGGCTATCCAG 3360 

Qy 3361 AAAT C AGAT GAAGGC CAC C CAT T C AG GGCAT AT T TAGAAT CT GAAGT T G CT AT ATCAGAG 342 0 

I I II I II I M I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I M I I I I I I I I I I I I I 
Db 3361 AAAT CAGAT GAAG GC C AC C CAT T C AGGGCAT AT T TAGAAT C T GAAGT T GCT AT AT C AGAG 3420 

Qy 3421 GAAT T G GT T C AGAAAT AC AGT AAT TCTGCTCTTGGT CAT GT GAACAG CACAATAAAAGAA 34 80 

I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I 

Db 3421 GAAT T G GT T C AGAAAT AC AGTAAT TCTGCTCTT GGT C AT GT GAACAGCACAAT AAAAGAA 34 8 0 

Qy 3481 CTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTGATG 3540 

II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 34 81 CT GAG GCGGCTTTTCT T AGT T GAT GAT T T AGT T GAT T C CCT GAAGT T T G C AGT GTT GAT G 354 0 

Qy 3541 TGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTG 3600 

I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3541 TGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTG 3600 

Qy 3601 AT CT CAC T CT T C AGT AT T C CT GT TAT T TAT GAAC GGCAT C AG GT GC AGAT AGAT CAT TAT 3660 

I I I I I I I I I I I I I I I II I I t I I I I I I I I I I I I I I I I I I I I I I I I I I I I I t I M I II I I I I 
Db 3601 AT CT CAC T CT T C AGT AT T C C T GT T AT TT AT GAAC GGCAT C AG GT GC AGAT AGAT CAT TAT 3660 

Qy 3661 CT AG GAC T T GCAAACAAGAGT GT T AAGGAT G C CAT GGC CAAAAT C CAAGCAAAAAT C C CT 3720 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I II I I I I I I I I I 1 M I I I I I I I I 
Db 3661 CT AG GAC T T G CAAACAAGAGT GT T AAGGAT G C CAT GGC CAAAAT C CAAGCAAAAAT C C C T 3720 

Qy 3721 GGATTGAAGCGCAAAGCAGAT 3741 

I I I I I II I I I I I I I I I I I I 1 I 
Db 3721 GGATTGAAGCGCAAAGCAGAT 3741 
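Note on scoring: the header names a Smith-Waterman ("sw") model with the IDENTITY_NUC table, Gapop 10.0 and Gapext 1.0. The sketch below is a generic Smith-Waterman scorer with affine gaps (Gotoh recurrences), not GenCore's implementation. The match value of 1 follows from the perfect score of 3741 for a 3741-base query. The mismatch value of -0.6 is inferred from this result (3740 matches and 1 mismatch giving 3739.4) and is an assumption, as is the convention that a gap of length k costs Gapop + k * Gapext. The two test strings are the Qy and Db rows for positions 61 to 120 as they appear in this copy, which differ at one position.

```python
def smith_waterman_score(query, target, match=1.0, mismatch=-0.6,
                         gap_open=10.0, gap_extend=1.0):
    """Best local alignment score with affine gap penalties.

    A gap of length k costs gap_open + k * gap_extend (assumed convention).
    """
    neg_inf = float("-inf")
    n = len(target)
    h_prev = [0.0] * (n + 1)      # best scores for the previous query row
    f_prev = [neg_inf] * (n + 1)  # previous row, alignments ending in a vertical gap
    best = 0.0
    for i in range(1, len(query) + 1):
        h_cur = [0.0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf               # current row, alignments ending in a horizontal gap
        for j in range(1, n + 1):
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f_cur[j] = max(f_prev[j] - gap_extend,
                           h_prev[j] - gap_open - gap_extend)
            sub = match if query[i - 1] == target[j - 1] else mismatch
            h_cur[j] = max(0.0, h_prev[j - 1] + sub, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


# Rows for positions 61..120 of RESULT 1, as printed in this copy.
qy = "ATCGCGAAGGCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT"
db = "ATCGCGAAGGCAGGAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT"
print(round(smith_waterman_score(qy, db), 1))
```

With 59 matches and one mismatch the segment scores 59 - 0.6 = 58.4 under these assumed values, in the same way that 3740 - 0.6 gives the reported 3739.4 for the full alignment.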

RESULT 2
AX766046

ORIGIN

Query Match 93.3%; Score 3489; DB 6; Length 3489; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 3489; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II i I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 1 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 60 

Qy 313 CCGCCCGCCTT CAAGT AC C AGT T C GT GAC GGAGC C C GAGGAC GAGGAGGAC GAGGAGGAG 372 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I M I I I I I 
Db 61 CCGCCCGCCTT CAAGT AC C AGT T C GT GAC GGAGC C C GAGGAC GAGGAGGAC GAG GAGGAG 120 

Qy 373 GAG GAGGAC GAGGAG GAG GAC GAC GAG G AC CT AGAGGAAC T GGAGGT GC T G GAGAG GAAG 432 

I I I I I I I II I I I I I I I I I I I I I II I I II I II I I I I I I I I I I I I I I I I M I I I I I I I I I I I 
Db 121 GAGGAGGAC GAGGAGGAG GAC GAC GAG GAC CT AGAGGAAC T GGAG GT GC T GGAGAG GAAG 180 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGCCGCTGCTGGAC 492 

I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I II I I I I I I I I 
Db 181 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGCCGCTGCTGGAC 2 40 

Qy 493 TTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGCC 552 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II II I I I I I I I I I I I I I I I I I I I 
Db 241 TTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGCC 300 

Qy 553 GCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGCCATCCCTGCCG 612 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 301 GCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGCCATCCCTGCCG 360 

Qy 613 CCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCCGGCGAGGCCC 672 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 CCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCCGGCGAGGCCC 42 0 



Qy        673 CCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGCCCCCTTCCACG 732
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 CCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGCCCCCTTCCACG 480



Qy 733 CCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCT 792 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 CCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCT 540 

Qy 793 GC T GC AT C T GAGC CT GT GAT AC CCTCCTCTG C AGAAAAAAT TAT G GAT T T GAT GGAGC AG 852 

M I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I M I I I I I I I 
Db 541 GC T GC AT C T GAGC CT GT GAT AC CCTCCTCTG CAGAAAAAAT TAT GGAT T T GAT GGAGC AG 600 

Qy 853 CCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCT 912 

I I II I I I I I I II I I I i I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 601 CCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCT 660 

Qy 913 GCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTT 972 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 661 GCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTT 720 

Qy 973 GGTAACTTATCAGCAGTGTCATCCTCAGAAGGAACAATTGAAGAAACTTTAAATGAAGCT 1032

I I I I I I I I I I I I I I I I M II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I 
Db 721 GGTAACTTAT CAGCAGTGTCAT C CT CAGAAGGAACAAT T GAAGAAACTTTAAATGAAGCT 780 

Qy 1033 T CTAAAGAGT T G C CAGAGAGGG CAACAAAT C CAT T T GT AAAT AGAGATT T AGC AGAAT T T 1092 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I 
Db 781 T CTAAAGAGT T G C CAGAGAG GG CAACAAAT C C ATT T GT AAAT AGAGATT T AGC AGAATT T 840 

Qy 1093 TCAGAATTAGAATATTCAGAAATGGGATCATCTTTTAAAGGCTCCCCAAAAGGAGAGTCA 1152 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 T C AGAAT T AGAAT AT T C AGAAAT G G GAT CAT C T T T T AAAG G C T C C C C AAAAG G AG AGT C A 900 

Qy 1153 G C CAT AT T AGT AGAAAACACTAAGGAAGAAGT AAT T GT GAGGAGTAAAG ACAAAGAG GAT 1212 

I I I I I I I I I I I I I I I I I I I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 901 G C CAT AT T AGT AGAAAACAC TAAGGAAGAAGT AAT T GT GAGGAGT AAAGACAAAGAGGAT 960 

Qy 1213 T T AGT T T GT AGT G C AG C C CT T CAC AGT C C ACAAGAAT CAC CT GT G GGT AAAGAAGACAGA 1272 

I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I 
Db 961 T T AGT T T GT AGT GC AG C C CT T CAC AGT C C ACAAGAAT C AC CT GT G GGT AAAGAAGACAGA 102 0 

Qy 1273 GTTGT GT CT C CAGAAAAGACAAT GGACATTT TTAAT GAAAT GCAGAT GT CAGTAGTAGCA 1332 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I 
Db 1021 GTTGT GTCT CCAGAAAAGACAAT GGACATTT TTAAT GAAAT GCAGAT GT CAGTAGTAGCA 108 0 

Qy 1333 C CT GT GAG G GAAGAGT AT GC AGAC T T T AAG C CAT T T GAACAAGC AT GGGAAGT GAAAGAT 1392 

I I I I II II I I I I I I I I I I I I I II I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1081 C CT GT GAG G GAAGAGT AT G C AGAC T T T AAG C CAT T T GAACAAGC AT GGGAAGT GAAAGAT 114 0 

Qy 1393 AC TT AT GAG GGAAGT AGGGAT GT G C T GGCT G CT AGAGCT AAT GT G GAAAGTAAAGT GGAC 1452 

I I I I I I I I I I I I I I I I I I I I I I 1 II I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I 
Db 1141 ACTTAT GAGGGAAGTAGGGATGT GCT GGCT GCT AGAGCT AAT GTG GAAAGTAAAGT GGAC 1200 

Qy 1453 AGAAAAT GCT T GGAAGAT AGC CT GGAGCAAAAAAGT C T T G GGAAGGAT AGT GAAG G CAGA 1512 

I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I II I I I I I I M I I I I I I I I I I I 
Db 1201 AGAAAAT GCT T GGAAGAT AGC CT GGAGCAAAAAAGTCT TGGGAAGGATAGT GAAGGCAGA 1260 

Qy 1513 AAT GAG GAT GCTTCTTTCCC CAGT AC C C C AGAAC CT GT GAAG GAC AGCT C C AGAG CAT AT 1572 

I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I M II II I I I I I I 
Db 1261 AAT GAGGAT GCTTCTTTCCC CAGT AC C C C AGAAC C T GT GAAGGACAGCT C C AGAGC AT AT 1320 



Qy 1573 ATTACCTGTGCTTCCTTTACCTCAGCAACCGAAAGCACCACAGCAAACACTTTCCCTTTG 1632 

I I I I I I I I I II I I I I I 1 I I I I I I I I M I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I I I 
Db 1321 ATTACCTGTGCTTCCTTTACCTCAGCAACCGAAAGCACCACAGCAAACACTTTCCCTTTG 1380 

Qy 1633 T T AGAAGAT CAT ACTT CAGAAAATAAAACAGAT GAAAAAAAAAT AGAAGAAAGGAAGGC C 1692 

I I I I I I I I I I I I I I I I I I I I II I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1381 TTAGAAGAT CAT ACTT CAGAAAATAAAACAGAT GAAAAAAAAAT AGAAGAAAGGAAGGCC 14 4 0 

Qy 1693 CAAAT T AT AAC AGAGAAGACT AG C C C CAAAAC GT CAAAT CCTTTCCTT GT AGC AGTAC AG 1752 

I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I II I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 1441 CAAAT TAT AAC AGAGAAGACT AGC C C CAAAAC GT CAAAT CCTTTCCTT GT AGC AGTAC AG 1500 

Qy 1753 GAT T C T GAGGC AGATT AT GT T ACAAC AGAT AC C T TAT CAAAGGT GAC T GAGGC AGCAGT G 1812 

I I I I I I I I I I I I II I I M I I II I I I I I I I I II I I I I I I I I I II I I I I I I I II I I I I I I I I 
Db 1501 GAT T CT GAGGC AGATT AT GT T AC AAC AGAT AC C T TAT CAAAGGT GACT GAGGC AGCAGT G 1560 

Qy 1813 T CAAAC AT G C CT GAAGGT CT GAC G C C AGAT T T AGT T CAG GAAGC AT GT GAAAGT GAACT G 1872 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 1561 T CAAAC AT GC CT GAAGGT CT GAC GC C AGAT T T AGT T CAGGAAGCAT GT GAAAGT GAACT G 1620 

Qy 1873 AAT GAAGC C AC AGGTACAAAGAT T G CT T AT GAAACAAAAGT GGACTT GGT C CAAAC AT C A 1932 

i I I I I I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 1621 AAT GAAG C C AC AG GT ACAAAGAT T G CT T AT GAAACAAAAGT GGACTT GGT C CAAACAT C A 1680 

Qy 1933 GAAG CT AT AC AAGAAT CACT T T AC C C CAC AGC AC AGCT T T GC C CAT CAT T T GAG GAAGC T 1992 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 1681 GAAGCT AT AC AAGAAT CACT T T AC C C CAC AGC AC AGCT T T GC C CATC AT T T GAGGAAGC T 1740 

Qy 1993 GAAGCAAC T C C GT CAC CAGT T T T GC C T GAT AT T GT TAT G GAAGC AC CAT T AAAT T CT CT C 2 052 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I M I 
Db 1741 GAAGCAACT C C GT CAC CAGT T T T G C C T GAT AT T GT TAT GGAAGC ACC AT T AAAT T CT C T C 1800 

Qy 2053 CTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTGGAAGCACCTCCT 2112 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I M I I I I I II I I I I I I I I I 
Db 1801 CTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTGGAAGCACCTCCT 1860 

Qy 2113 CCAGTTAGTTAT GACAGTATAAAGCTTGAGCCT GAAAACCCCCCACCATATGAAGAAGCC 2172 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 1861 C C AGTT AGT TAT GAC AGT AT AAAGCT T GAG C CT GAAAAC C C CC CACC AT AT GAAGAAGC C 192 0 

Qy 2173 ATGAATGTAGCACTAAAAGCTTTGGGAACAAAGGAAGGAATAAAAGAGCCTGAAAGTTTT 2232 

I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I 
Db 1921 ATGAATGTAGCACTAAAAGCTTTGGGAACAAAGGAAGGAATAAAAGAGCCTGAAAGTTTT 1980 

Qy 2233 AAT GC AG C T GT T C AGGAAAC AGAAG CT C CT TAT AT AT C CAT T G CGT GT GAT T TAAT TAAA 2292 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 1981 AAT GC AGCT GT T CAG GAAAC AGAAGCT C C T TAT AT AT C CAT T GCGT GT GAT T TAAT TAAA 2040 

Qy 2293 GAAACAAAGCT CT C CACT GAGC C AAGT C C AGAT T T CT CT AATT AT T C AGAAAT AGCAAAA 2352 

I 1 I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I M 
Db 2041 GAAACAAAG CT C T C CACT GAGC C AAGT C CAG AT T T CT CT AATT AT T C AGAAAT AGC AAAA 2100 

Qy 2353 T T C GAGAAGT C G GT G C C C GAACAC G CT G AGCT AGT G GAG GATT CCT C AC CT GAAT CT GAA 2412 

I I I I I I I I I I I I II I I I I I I II I I M I I I II I I I I I I I I I I I I I I I I II I I I I I M I I I I 
Db 2101 T T C GAGAAGT CGGT G C C C GAAC AC GC T GAG C T AGT GGAGGAT T CCT CAC CT GAAT CT GAA 2160 



Qy 2413 C CAGT T GACT TAT T T AGT GAT GAT T C GAT T C C T GAAGT CC C ACAAAC ACAAGAGGAGG CT 2472 

I I II I I I I I I I I I I I I I I I I I I I 1 M I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 2161 C C AGT T GACT TAT T T AGT GAT GAT T C GAT T C CT GAAGT CC CACAAACACAAGAGGAGGCT 2220 

Qy 2473 GT GAT G C T CAT GAAG GAGAGT CT C AC T GAAGT GT C T GAGAC AGT AG C CCAG C AC AAAGAG 2532 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 
Db 2221 GT GAT G C T CAT GAAG GAG AGT CT C AC T GAAGT GT CT GAGACAGT AGC C C AGCACAAAGAG 228 0 

Qy 2533 GAGAGAC T T AGT GC C T CAC CTC AGGAG C TAG GAAAGC CAT AT T T AGAGT CT T T T C AGC C C 2592 

I I I I I II I I I 1 I I I I II I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 22 81 GAGAGAC T T AGT GC C T CAC C T CAGGAG CT AG GAAAGC CAT AT T T AGAGT CT T T T C AGC C C 234 0 

Qy 2593 AATT T ACAT AGT ACAAAAGAT GCT GC AT CT AAT GACAT T C CAACAT T GAC CAAAAAGGAG 2652 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I M I I I I I I I I I I I I I I I I I I I I I I 
Db 2341 AAT T T AC AT AGT ACAAAAGAT GCT GC AT CT AAT GACAT T C CAACAT T GAC CAAAAAGGAG 24 00 

Qy 2653 AAAAT T T CT T T GCAAAT GGAAGAGT T T AAT ACT G CAAT TT AT T CAAAT GAT GACT TACT T 2712 

I I I I I M II I I I I I I I I I I I I I I I I I I II I I I M I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 24 01 AAAAT T T CT T T GCAAAT GGAAGAGT T T AAT AC T G CAAT TT AT T CAAAT GAT GACT TACT T 2 4 60 

Qy 2713 T C TT CTAAG GAAGACAAAATAAAAGAAAGT GAAACATT TT C AGAT T CAT CT C C GAT T GAG 2772 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I II I I 
Db 2461 T C T T C TAAGGAAGAC AAAAT AAAAGAAAGT GAAAC AT T T T C AGAT T CAT C T C C GAT T GAG 2520 

Qy 2773 AT AAT AGAT GAAT T T C C CAC GT T T GT CAGT G CT AAAGAT GAT T CT C CT AAAT TAG C CAAG 2832 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I II I I I I I I I 
Db 2521 AT AAT AGAT GAAT T T C C CAC GT T T GT CAGT G CT AAAGAT GAT T CT C CT AAAT TAG C CAAG 258 0 

Qy 2833 GAGT AC AC T GAT CT AGAAGT AT C C GACAAAAGT GAAAT T GC TAAT AT CCAAAGC G GGG C A 2 8 92 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I 
Db 2581 GAGT ACAC T GAT C T AGAAGT AT C C GACAAAAGT GAAAT T GCTAAT AT CCAAAGC G GGGC A 2 64 0 

Qy 28 93 GAT T CAT TGCCTTGCT T AGAAT T GC C CT GT GAC CT T T C TT T CAAGAAT AT AT AT C CT AAA 2 952 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 2641 GATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTT CAAGAAT AT AT AT CCTAAA 2700 

Qy 2 953 GAT GAAGT ACAT GT T T CAGAT GAAT T CT C CGAAAAT AGGT C CAGT GT AT CTAAGGCAT C C 3012 

I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2701 GAT GAAGT AC AT GT T T CAGAT GAAT T CT C CGAAAAT AG GT C CAGT GT AT CTAAGGCAT C C 2760 

Qy 3013 AT AT C G C C T T CAAAT GT CTCTGCTTT GGAAC CT CAGAC AGAAAT G GG C AGC AT AGTT AAA 3072 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I 
Db 2761 AT AT C GC C T T CAAAT GT CTCTGCTTT GGAAC C T CAGAC AGAAAT G GG C AGC AT AGT T AAA 2 820 

Qy 3073 T CCAAAT CACT T ACGAAAGAAGCAGAGAAAAAACTT C CTT CT GACACAGAGAAAGAGGAC 3132 

I I I I I I I I I I I M I I I I I I II M I I I M I I I I I M I I I I I I I I I I I I I I I I II II I I I I I 
Db 2 821 T C CAAAT CACTT AC GAAAGAAGCAGAGAAAAAACTT C CTT CT GACACAGAGAAAGAGGAC 28 8 0 

Qy 3133 AGATCCCTGTCAGCTGTATTGTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTC 3192 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I M I I 
Db 28 81 AGATCCCTGTCAGCTGTATTGTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTC 2 94 0 

Qy 3193 TACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTG 3252 

I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I M 
Db 2 941 TACT GGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCT GCT GCT G 3000 



Qy 3253 TCTCTGACAGTGTTCAGCATTGT CAGT GTAACGGCCTACATTGC CTT GGCCCT GCT CTCG 3312 

Db 3001 TCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCG 3060 

Qy 3313 GT GACT AT C AGCT T T AGGATAT AT AAG G GC GT GAT C C AGGCT AT C C AGAAAT CAGAT GAA 3372 

1 I I I M 1 1 1 1 1 1 1 1 M M 1 M 1 1 1 1 II 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 M 1 1 M 
Db 3061 GT GACT AT CAGCT T T AGGATAT AT AAG GGC GT GAT C CAGGCT AT C C AGAAAT CAGAT GAA 3120 

Qy 3373 GGC C AC C CAT T C AGG GCAT ATT T AGAAT CT GAAGT T GC T AT AT C AGAG GAAT T GGT T C AG 3432 

II 1 1 II 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 3121 GGC CAC C CAT T C AGG GCAT ATT T AGAAT CT GAAGT T GC T AT AT C AGAG GAAT T GGT T C AG 3180 

Qy 3433 AAATACAGTAATT CT GCT CTTGGTCAT GT GAACAGCACAATAAAAGAACT GAGGCGGCTT 3492 

1 || 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 
Db 3181 AAATACAGTAATT CT GCT CTTGGTCAT GT GAACAGCACAATAAAAGAACT GAGGCGGCTT 3240 

Qy 3493 T T CT T AGT T GAT GAT T T AGT T GAT T C C CT GAAGT T T G C AGT GT T GAT GT G G GT GTT T ACT 3552 

1 I I I I I || 1 I 1 1 1 1 1 1 II 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 3241 T T CT T AGT T GAT GAT T T AGT T GAT T C C CT GAAGT T T GC AGT GT T GAT GT G G GT GTT T AC T 3300 

Qy 3553 TATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTC 3612 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 
Db 3301 TATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTC 3360 

Qy 3613 AGT AT T C CT GT TAT T TAT GAAC GG C AT C AGGT GC AGAT AGAT CAT TAT C T AGGACT T GC A 3672 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 3361 AGT AT T C C T GT T AT T TAT GAAC GGC AT CAGGT GCAGAT AGAT CAT TAT CT AGGACT T GC A 3420 

Qy 3673 AACAAGAGT GTT AAG GAT GC CAT GGC CAAAAT CCAAGCAAAAAT C CC T GGAT T GAAGC G C 3732 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 
Db 3421 AACAAGAGT GT TAAGGAT GC CAT GGC CAAAAT C CAAGC AAAAAT C CC T GGAT T GAAGC G C 3480 

Qy 3733 AAAG CAGAT 3741 

1 1 1 1 1 1 1 1 1 
Db 3481 AAAG CAGAT 3489 
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AY102284 
LOCUS AY102284 4627 bp mRNA linear ROD 29-JAN-2003 
DEFINITION Mus musculus RTN4 (Rtn4) mRNA, complete cds, alternatively spliced. 
ACCESSION AY102284 
VERSION AY102284.1 GI:23379816 
KEYWORDS 
SOURCE Mus musculus (house mouse) 
ORGANISM Mus musculus 
Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 
REFERENCE 1 (bases 1 to 4627) 
AUTHORS Oertle,T., Huber,C., van der Putten,H. and Schwab,M.E. 
TITLE Genomic Structure and Functional Characterisation of the Promoters 
of Human and Mouse nogo/rtn4 
JOURNAL J. Mol. Biol. 325 (2), 299-323 (2003) 
MEDLINE 22376540 
PUBMED 12488097 
REFERENCE 2 (bases 1 to 4627) 
AUTHORS Oertle,T. and Schwab,M.E. 
TITLE Direct Submission 
JOURNAL Submitted (07-MAY-2002) Brain Research Institute, University of 
Zurich and ETH Zurich, Winterthurerstr. 190, Zuerich 8057, 
Switzerland 
REFERENCE 3 (bases 1 to 4627) 
AUTHORS Van der Putten,H. 
TITLE Direct Submission 
JOURNAL Submitted (07-MAY-2002) Nervous System Research, Novartis Pharma 
Inc., Basel, Switzerland 
FEATURES Location/Qualifiers 
source 1..4627 
/organism="Mus musculus" 
/mol_type="mRNA" 
/strain="129/SvcJ7" 
/db_xref="taxon:10090" 
/chromosome="11" 
gene 1..4627 
/gene="Rtn4" 
/note="synonym: nogo" 
5'UTR 1..248 
/gene="Rtn4" 
CDS 249..3737 
/gene="Rtn4" 
/note="NOGO-A; RTN4-A; alternatively spliced" 
/codon_start=1 
/product="RTN4" 
/protein_id="AAM73506.1" 
/db_xref="GI:23379817" 
/translation="MEDIDQSSLVSSSADSPPRPPPAFKYQFVTEPEDEEDEEDEEEE 
EDDEDLEELEVLERKPAAGLSAAPVPPAAAPLLDFSSDSVPPAPRGPLPAAPPTAPER 
QPSWERSPAASAPSLPPAAAVLPSKLPEDDEPPARPPAPAGASPLAEPAAPPSTPAAP 
KRRGSGSVDETLFALPAASEPVIPSSAEKIMDLKEQPGNTVSSGQEDFPSVLFETAAS 
LPSLSPLSTVSFKEHGYLGNLSAVASTEGTIEETLNEASRELPERATNPFVNRESAEF 
SVLEYSEMGSSFNGSPKGESAMLVENTKEEVIVRSKDKEDLVCSAALHNPQESPATLT 
KVVKEDGVMSPEKTMDIFNEMKMSVVAPVREEYADFKPFEQAW 
RANMESKVDKKCFEDSLEQKGHGKDSESRNENASFPRTPELVKDGSRAYITCDSFSSA 
TESTAANIFPVLEDHTSENKTDEKKIEERKAQIITEKTSPKTSNPFLVAIHDSEADYV 
TTDNLSKVTEAWATMPEGLTPDLVQEACESELNEATGTKIAYETKVDLVQTSEAIQE 
SIYPTAQLCPSFEEAEATPSPVLPDIVMEAPLNSLLPSTGASVAQPSASPLEVPSPVS 
YDGIKLEPENPPPYEEAMSVALKTSDSKEEIKEPESFNAAAQEAEAPYISIACDLIKE 
TKLSTEPSPEFSNYSEIAKFEKSVPDHCELVDDSSPESEPVDLFSDDSIPEVPQTQEE 
AVMLMKESLTEVSETVTQHKHKERLSASPQEVGKPYLESFQPNLHITKDAASNEIPTL 
TKKETISLQMEEFNTAIYSNDDLLSSKEDKMKESETFSDSSPIEIIDEFPTFVSAKDD 
SPKEYTDLEVSNKSEIANVQSGANSLPCSELPCDLSFKNTYPKDEAHVSDEFSKSRSS 
VSKVPLLLPNVSALESQIEMGNIVKPKVLTKEAEEKLPSDTEKEDRSLTAVLSAELNK 
TSWDLLYWRDIKKTGWFGASLFLLLSLTVFSIVSVTAYIAIJUiLSvTISFRIYKGV 
IQAIQKSDEGHPFRAYLESEVAISEELVQKYSNSALGHVNSTIKELRRLFLVDDLVDS 
LKFAVLMWFTYVGALFNGLTLLILALISLFSIPVIYERHQAQIDHYLGLANKSVKDA 
MAKIQAKIPGLKRKAE" 
3'UTR 3738..4627 
/gene="Rtn4" 
ORIGIN 


Query Match 85.6%; Score 3202.4; DB 10; Length 4627; 

Best Local Similarity 92.8%; Pred. No. 0; 

Matches 3488; Conservative 0; Mismatches 211; Indels 59; Gaps 10; 
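
Note on the summary figures: the two percentages reported for this result appear to follow from the other numbers in the report. The sketch below is a minimal check, assuming that Query Match is the score divided by the perfect score (3741, from the report header) and that Best Local Similarity is the number of matches divided by all aligned columns (matches + mismatches + indels). Both formulas are inferred from the reported numbers rather than taken from GenCore documentation, and the function names are hypothetical.

```python
# Hypothetical helpers: the formulas are assumptions inferred from the
# figures printed in this report, not a documented GenCore definition.

def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect (self-match) score."""
    return 100.0 * score / perfect_score


def local_similarity_pct(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns."""
    aligned_columns = matches + mismatches + indels
    return 100.0 * matches / aligned_columns


# Figures for the AY102284 hit; 3741 is the perfect score from the header.
print(round(query_match_pct(3202.4, 3741), 1))        # 85.6
print(round(local_similarity_pct(3488, 211, 59), 1))  # 92.8
```

Under the same assumptions, the score of 3200.4 reported for AY114152 gives a Query Match of 85.5%, in agreement with that result's summary line.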



Qy 2 TTGCTCGTCTGGGC-GGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCG 60 

Db 16 TTGCTCATCTGGGCGGGCGGCGGCTGCTGCAACTGAGGACAGGGCGGGTGGCGCATCTCG 75 

Qy 61 ATCGCGAAGGCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT 120 

I I I I I I I I II I II II I I I I I I I I I I I I I lllll I I I I I I I III I I I I I I 
Db 76 AGCGCGGAGGCAGGAGGAGAAGTCTTATTGTTCCTGGAGCTGTCGCCTTTGCGGGTTCCT 135 

Qy 121 CGGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACA 180 

lllll II I I I I I I III I I I I I I I I I I I I I II I I I I II II I I I I I 

Db 136 CGGCTTGG TTCGGCCAGCCCGGCCTCTGCCAGTCCTGCCCAACCCCCACA 185 

Qy 181 ACCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGC 24 0 

I I I I I I I I I I I I I I I II I I I I I I lllll I I I I II I I I I I I I I I II I II I I III I 

Db 186 ACCGCCCGCGGCTCTGAGGAGAAGTGGCCC-GCGGCGGCAGTAGCTGCAGCATCATCGCC 244 

Qy 241 GAC C C G C CAGCCAT G GAAGAC AT AGAC C AGT C GT C GCT GGT CTCCTCGTC C AC G G AC AGC 300 

II I I I I II I I I I I I I I II I I I I I I I I I I I I I I II I I I II I I I I MM III 

Db 245 GA CCATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCGCGGATAGC 2 96 

Qy 301 CCGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 360 

I II I I I I I I I I I I I I I I I I I I M I I II II I I II I I I I I I I II I I I II I II II I I I I II 
Db 297 CCGCCCCGGCCCCCGCCCGCTTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 356 

Qy 361 GAC GAGGAGGAGGAGGAGGAC GAGGAGGAGGACGACGAGGACCTAGAGGAACT GGAGGT G 42 0 

I II I II I I II II I I I I I I II II I II II I I I I I I I II II I I I I II I I I M I I I I 
Db 357 GAC GAGGAAGAC GAGGAG GAGGAGGAGGAC GACGAGGACCT GGAGGAATT GGAGGT G 413 

Qy 421 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCG 4 80 

I II I I II I I I I I I I I I I I II I II I I II II I I I III II I I I I I I II I I I I II I I 
Db 414 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCGGCTCCGGT GCCCCCCGCCGCCGCA 4 70 

Qy 4 81 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 54 0 

I II I I II I I I I I I I I II I II I I I II II II II I I I I I I I II I I I I I I I I II II II I II II I 
Db 471 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 530 

Qy 541 GCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCG 600 

I II I I I I I I I I I I I I I II I I II II II I I II I I I I II I I I I I I I I I I I I II lllll 
Db 531 GCGCCCCCCACCGCCCCTGAGAGGCAGCCGTCCTGGGAACGCAGCCCCGCGGCGTCCGCG 590 

Qy 601 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCT 660 

I II I I II I I I I I I I I I I I I II I I II I I II II I I I I I I I I I I I I I I I I I I I I II I II I I I 

Db 591 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCGGAGGACGACGAGCCT 650 

Qy 661 CCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCG 72 0 

II III I I II I II I I II I I II I II I II I I I I I I I I I I I I I I I I II II I I I I 

Db 651 CCAGCG CGGCCTCCGGCGCCAGCCGGCGCGAGCCCCCTAGCGGAGCCCGCCGCG 704 

Qy 721 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTT 7 80 

I II I I II I I I I I I II II I I I I I I II II II II I I II I I I II I I I I I I I I I I I II II I I I I 
Db 7 05 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCGGGCTCAGTGGATGAGACCCTT 764 

Qy 781 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 84 0 

I I I II I I II I I I M I I I I II I II I I I I II II I I II I I I I I I I I I I I I I I I I I II II I I I I 
Db 7 65 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 82 4 

Qy 841 TTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTG 900 

I II I I I I I I I I I II I I II I I I I II I I II II I I II I I I II I I I I I I I I I I I I II I M I I I 



Db 825 T T GAAG GAGC AG C C AG GT AAC ACT GTTTCGTCT GGT CAAGAGGAT T T C C CAT CT GT C CT G 884 

Qy 901 CTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 885 TTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 944 

Qy 961 CAT GGATACCTT GGTAACTT AT C AGCAGT GT CAT CCT CAGAAGGAACAATT GAAGAAACT 1020 

II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I 1 II I I I II I I I I I I I I 
Db 945 C AC G GAT ACC TT GGT AACT T AT C AG CAGT GGC AT C CAC AGAAGGAACT AT T GAAGAAACT 1004 

Qy 1021 T T AAAT GAAGCT T CTAAAGAGT T G C C AGAGAGGGCAACAAAT C CAT T T GT AAAT AGAGAT 108 0 

I I I I I I I II I I I I I I I III I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 1005 T TAAAT GAAGCT T CT AGAGAAT T GC C AGAGAGG GCAACAAAT C CATT T GT AAAT AGAGAG 1064 

Qy 1081 T TAG CAGAAT TT T C AGAAT T AGAAT AT T CAGAAAT GGGAT CAT CT T T T AAAGGCT C CC C A 114 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II II MINIMI 
Db 1065 T C AGC AGAGT TT T CAGT AT T AGAAT AC T CAGAAAT GGGAT CAT CT T T CAAT GG CT C C C C A 112 4 

Qy 1141 AAAGGAGAGTCAGCCATATTAGTAGAAAACACTAAGGAAGAAGTAATTGTGAGGAGTAAA 12 00 

I I I I I I I I I I I I I I I I I I I I I I I II M I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I 
Db 1125 AAAGGAGAGTCAGCCATGTTAGTAGAAAACACTAAGGAAGAAGTAATTGTGAGGAGTAAA 1184 

Qy 1201 GACAAAGAGGATTTAGTTT GTAGT GCAGC CCTT CACAGT CCACAAGAAT CACCT 1254 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M I I I I I I I I I I 
Db 1185 GACAAAGAGGATTTAGTTT GTAGT GCAGC C CTT CATAAT CCACAAGAGT CACCTGC GACC 1244 

Qy 1255 GT GGGTAAAGAAGACAGAGT T GT GT CT C C AGAAAAGACAAT GGAC AT TT T T 1305 

I I I I I I I I I I I I I I Mill I I I I I I I I I I I I I I M I M I I I I I I I I I I 
Db 1245 CT T AC T AAAGT GGT T AAAGAAGAC GGAGT TAT GT CT C C AGAAAAGACAAT G GAC AT T T T T 1304 

Qy 1306 AAT GAAAT GCAGAT GT CAGT AGT AG CAC C T GT GAG G GAAGAGT AT G C AGACT T T AAG C C A 1365 

I I I I I I I I I I M I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I M I II I I I I I I I 
Db 1305 AAT GAAAT GAAAAT GT CAGT GGT AG CAC C T GT GAG G GAAGAGTAT GCAGAT T T TAAGC CA 1364 

Qy 1366 T T T GAACAAG CAT G GGAAGT GAAAGAT AC T TAT GAG GGAAGT AGGGAT GT GCTGGCTGCT 1425 

I I I I I I I I I II I I I I I I I II I I I II I I I M I II I I I I I I I I I I I M I I I I I I I I I I I I I I 
Db 1365 TT T GAACAAGCATGGGAAGT GAAAGATACTT AT GAGGGAAGT AGGGAT GT GCT GGCT GCT 1424 

Qy 142 6 AGAGCT AAT GT GGAAAGT AAAGT G GAC AGAAAAT GCT T G GAAGAT AGC C T G GAGCAAAAA 14 85 

I I I I I I I I I II I I I I I I I I I II II I I I I II I I I I I I I I I I I I I I I I I I I II I M I II 
Db 1425 AGAGCT AAT AT GGAAAGTAAAGT GGACAAAAAAT GCTT T GAAGAT AGCCT GGAGCAAAAA 1484 

Qy 1486 AGT CTT GG GAAGGAT AGT GAAG G CAGAAAT GAG GAT GCTT CT T TC CC C AGT AC C C CAGAA 154 5 

Ml II II I II I I II II I II I II II II I II I I II M I I I I I I I I II I II II II II I 
Db 14 85 G GT CAT GG GAAG GAT AGT GAAAGCAGAAAT GAGAAT GCTTCTTT C CC C AGGAC C C CAGAA 154 4 

Qy 1546 CCTGTGAAGGACAGCTCCAGAGCATATATTACCTGTGCTTCCTTTACCTCAGCAACCGAA 1605 

I I M II II I M II I I I II M I II II I II I II I II I I I I I I II I II I I II I I I 

Db 1545 C T T GT GAAGGAC GGCT C C AGAGC GT AC AT C AC CT GT GAT T C CT T T AGCT C AGCAAC C GAG 1604 

Qy 1606 AGC AC CAC AGCAAACAC TTTCCCTTT GT T AGAAGAT CAT ACT T C AGAAAAT AAAACAGAT 1665 

II M II II I II II I I I II I I II I I II II II I I II I II II II I I I I I I II I II 

Db 1605 AGT ACT GC AG CAAACAT TT T C C C T GT GC T AGAAGAT CACACT T C AGAAAACAAAACAGAT 1664 

Qy 1666 GAAAAAAAAAT AGAAGAAAGGAAGGC C CAAAT T AT AAC AGAGAAGACTAGC C C CAAAAC G 1725 

I II M I I II II I M I II II I I II II II I II II II I I I I I II I I I M I I I II I II I I I I I I 
Db 1665 GAAAAAAAAAT AGAAGAAAGGAAGGC C CAAAT T ATAAC AGAGAAGAC TAG C C C CAAAAC G 1724 



Qy 1726 T CAAAT CCTTTCCTT GT AGC AGT AC AG GATT CT GAGGC AGAT TAT GT T ACAACAGAT AC C 17 85 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1725 T CAAAT CCTTTCCTT GT AGCAAT AC AT GAT T CT GAGG C AGAT TAT GT C ACAACAGAT AAT 1784 

Qy 1786 T TAT CAAAGGT GACT GAG G CAGCAGT GT CAAAC AT GC CT GAAGGT C T GAC G C CAGAT T T A 1845 

I I I I I I I I II I I I I I I I I I I I I I I I I III II I I I I I I I I I II I I I I I II I I I I I I I 
Db 1785 T TAT CAAAGGT GACT GAGGCAGT AGT GGCAAC CAT GCCT GAAGGT CTAACGCCAGATTTA 1844 

Qy 1846 GT T CAG GAAGCAT GT GAAAGT GAACT GAAT GAAGC CAC AGGT ACAAAGAT T GCT T AT GAA 1905 

I I I I I II I I I I I I I I I I I I I I I I I I I I II It II I I I I II I I I I I I I I I I I I I II I I I I I 
Db 184 5 GT T CAG GAAGCAT GT GAAAGT GAACT GAAC GAAGC CAC AGGT ACAAAGATT GCT TAT GAA 1904 

Qy 1906 ACAAAAGT GGACT T GGT C CAAAC AT C AGAAG CT AT AC AAGAAT C ACT TT AC C CC ACAG C A 1965 

I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I II I I I I I I III I I I I I I I I I I I I I I 
Db 1905 ACAAAAGT GGACT T GGT CCAGAC AT CAGAAGCT ATACAAGAGT CAATTTACC CCACAGCA 1964 

Qy 1966 C AGCT T T G C C CAT CAT T T GAGGAAG CT GAAG CAACT C C GT CAC C AGT TT T GCCT GAT AT T 2025 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I 
Db 1965 CAGCTTTGCC CAT CATTTGAGGAAGCTGAAGCAACTCCGT CAC CAGTTTT GCCT GAT ATT 2 024 

Qy 202 6 GTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCC 2 085 

I II I I I I I I I I I I I I I I I I I I I I II M I I I I I I I I I I I I I I II I II I I II I II I I I I 
Db 202 5 GTTATGGAAGCGCCATTAAATTCTCTCCTTCCAAGCACTGGTGCTTCTGTAGCGCAGCCC 2 084 

Qy 2086 AGTGTATCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCT 2145 

I I I I I I I I I I I I I I I II III I I I I I I I I I II I II I I I I I I I I I I II I I I I I I I I 
Db 2085 AGTGCATCCCCACTAGAAGTACCGTCTCCAGTTAGTTATGACGGTATAAAGCTTGAGCCT 2144 

Qy 214 6 GAAAAC C C C C CAC CAT AT GAAGAAG C CAT GAAT GTAGC ACT AAAAGCTT T G GGAACAAAG 22 05 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I II I I II I 
Db 2145 GAAAAT C C C C CAC CAT AT GAAGAAG C CAT GAGT GTAGC ACT AAAAAC AT C GGACT CAAAG 2204 

Qy 2206 GAAGGAATAAAAGAG C CT GAAAGT T TT AAT G C AGCT GT T C AGGAAAC AGAAGCT CCT T AT 22 65 

I I I I III I I I I I I I I I I I I I I II I I I I I I I I I II I I II I I I I II I I I II I I I I I II 
Db 22 05 GAAGAAATTAAAGAGCCTGAAAGTTTTAATGCAGCTGCTCAGGAAGCAGAAGCTCCTTAT 2264 

Qy 22 66 AT AT C CAT T GC GT GT GAT T T AAT T AAAGAAACAAAGCT CT C C ACT GAGC CAAGT C CAGAT 2 325 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 2265 AT AT CCATT GCAT GT GATTTAATTAAAGAAACAAAGCT CT CCACT GAGCCAAGT CCAGAG 2324 

Qy 232 6 T T CT CT AAT T ATT C AGAAAT AGCAAAAT T C G AGAAGT C GGT GC C C GAAC AC GCT G AGC T A 2385 

I I I I I I I I I I I I I I I II I I I I I I I I I I II II I I I I I I I I II I I II III I I I I I I 
Db 2325 TT CT CTAATT ATT C AGAAAT AGCAAAAT TT GAGAAGT C GGT GCCT GATCACT GT GAGCT C 2384 

Qy 238 6 GT GGAGGAT T C CT CAC C T GAAT CT GAAC CAGT T GACT TAT T T AGT GAT GAT T C GAT T C C T 2445 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 238 5 GT GGAT GAT T C CT CAC C C GAAT CT GAAC CAGT T GACT TAT T T AGT GAT GAT T C AAT T C C T 2 444 

Qy 244 6 GAAGT C CC ACAAAC ACAAGAGGAGGCT GT GAT GCT CAT GAAGGAGAGT CT C ACT GAAGT G 2505 

I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I II I I I M I I 
Db 24 4 5 GAAGT CCCACAAACACAAGAGGAGGCT GT GAT GCT AAT GAAGGAGAGT CT CACT GAAGT G 2504 

Qy 2506 T C T GAGAC AGT AG C C CAG CAC AAA GAG GAGAGAC T T AGT GCCT CAC CT C AGGAGCT A 2562 

I I I I I I I I I II I I II I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 2505 T CTGAGACAGTAACACAACACAAACATAAGGAGAGACTTAGT GCT T CAC CT CAGGAGGT A 2564 



Qy 2563 G GAAAG C CAT AT T T AGAGT CT T T T C AGC C CAAT TT AC AT AGT ACAAAAGAT GCT GCAT CT 2622 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 2565 G GAAAGC CAT AT T T AGAGT CT T T T C AG C C C AAT TT ACAT AT T ACAAAAGAT GC T G CAT CT 2 624 

Qy 2 623 AAT GAC AT T C C AAC AT T GAC C AAAAAGG AGAAAAT T T C T T T G C AAAT G G AAGAGT T T AAT 2682 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I II 
Db 2 625 AAT GAAAT T C CAAC AT T GAC CAAAAAG GAGACAAT TT C T T T GC AAAT GGAAGAGT T T AAT 2 684 

Qy 2683 AC T G CAAT TT AT T C AAAT GAT GAC T T ACT TT CT T CT AAG GAAGACAAAAT AAAAGAAAGT 2742 

I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I II 

Db 2685 AC T GCAAT TT AT T C CAAT GAT GACT T AC T T T CT TCTAAGGAAGAC AAAAT GAAAGAAAGT 2744 

Qy 2743 GAAACATTTT CAGATT CAT CT C CGATT GAGATAATAGAT GAATTT CC CAC GTTT GTCAGT 2802 

II I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I II I II I I I I I I 

Db 2745 GAAAC AT T TT C C GAT T CAT C T C C CAT T GAGATAATAGAT GAGT T T C C CAC AT T T GT C AGT 2804 

Qy 2 8 03 GC T AAAGAT GAT T C T C C TAAAT TAG C CAAGGAGT ACACT GAT CT AGAAGT AT C C GAC AAA 2 862 

I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2 8 05 GCT AAAGAT GAT T CT C C T AAGGAGT ACACT GAC CTAGAAGTATC CAAC AAA 2 855 

Qy 2 8 63 AGTGAAATTGCTAATATCCAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGT 2 922 

I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 2 856 AGTGAAATTGCTAATGTCCAGAGCGGGGCCAATTCGTTGCCTTGCTCAGAATTGCCCTGT 2 915 

Qy 2923 GAC CTTTCTTT CAAGAAT AT AT AT C CT AAAGAT GAAGT AC AT GT T T C AGAT GAAT T CT C C 2982 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II 
Db 2 916 GAC CT T T CTT T CAAGAAT ACAT AT C C T AAAGAT GAAGC AC AT GT CT CAGAT GAAT T CT C C 2 975 

Qy 2 983 GAAAAT AGGT CC AGT GT AT C T AAG GCAT C CAT AT C GC CT TCAAAT GT CT CT GC T T T GGAA 3042 

I II I I I I I I I I I I I I I I I I I I I I II III II I I I I I I I I I I I I I I I I I I I I 
Db 2976 AAAAGTAGGTCCAGTGTATCTAAGGTGCCCTTATTGCTTCCAAATGTTTCTGCTTTGGAA 3035 

Qy 3043 C CT CAGAC AGAAAT GGGC AGC AT AGT TAAAT C CAAAT C ACT T AC GAAAGAAGC AGAGAAA 3102 

I I I I I I I I I I I I I II I I I I I I I I I II I I I II I I I II I I I I I I I I I I I I I I II 
Db 3036 T CT CAAAT AGAAAT GG G CAAC AT AGT T AAAC C CAAAGT ACT T AC GAAAGAAGC AGAGGAA 3095 

Qy 3103 AAAC T T C C T T CT GAC AC AGAGAAAGAGGAC AGAT C C CT GT C AGCT GT AT T GT C AG CAGAG 3162 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3096 AAAC T T C CT T CT GAT AC AGAGAAAGAG GAC AGAT C C CT GAC AGC T GT AT T GT C AG CAGAG 3155 

Qy 3163 C T G AGT AAAACT T C AGT T GT T GAC CT C C T CT ACT GGAGAGAC AT TAAGAAGAC T G GAGT G 3222 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 3156 C T GAAT AAAACT T C AGT T GT T GAC C T C C T GT ACT GGAGAGACAT TAAGAAGAC T G GAGT G 3215 

Qy 3223 GTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTA 3282 

I I I I I I I I I I I I I I I II I I I I I I I I II I I II I I I II I I I I I I I I I I I II M I I I I I I I I I 
Db 3216 GTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTA 3275 

Qy 32 83 ACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGC 3342 

I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I II I I I I I I I I I I II I I I I I I I 
Db 327 6 ACGGCCTACATTGCCTTGGCCCTGCTCTCTGTGACTATCAGCTTTAGGATATATAAGGGT 3335 

Qy 3343 GT GAT CCAGGCTAT C CAGAAAT CAGAT GAAGGC CACCCATT CAGGGCATAT TT AGAAT CT 3402 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 3336 GT GAT CCAAGCTAT C CAGAAAT CAGAT GAAGGC CACCCATT CAGGGCATAT TT GGAAT CT 3395 

Qy 34 03 GAAGT T GCT AT AT CAGAGGAATTGGTTC AGAAAT ACAGTAATTCTGCTCTTGGTC AT GTG 34 62 



1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 

Db 3396 GAAGT T GC C AT AT CAGAG GAATT G GT T C AGAAAT AT AGTAAT TCTGCTCTT GGT CAT GT G 3455 

Qy 34 63 AAC AGC ACAAT AAAAGAACT GAG GC G GC T T T T CT T AGT T GAT GAT T T AGT T GAT T C CCTG 3522 

I I I I I I I I I I I I I I I I II I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I M I I I I 
Db 3456 AAC AG CACAAT AAAAGAAT T GAGG C GT CT CT T CT T AGT T GAT GAT T TAGT T GAT T C C CT G 3515 

Qy 3523 AAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACA 3582 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II 
Db 3516 AAGTTTGCAGTGTTGATGTGGGTATTTACTTACGTTGGTGCCTTGTTCAATGGTTTGACA 3575 

Qy 3583 CT ACT GAT T T TAG CT C T GAT CT C ACT CTT C AGT AT T C CT GT TAT T TAT GAAC GGC AT CAG 3642 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I 
Db 3576 CT AC T GAT T T T AG CT CT GAT CT C ACT C TT C AGT AT T C CT GT TAT AT AT GAAC G GC AT CAG 3635 

Qy 3643 GTGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGATGCCATGGCCAAA 3702 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I M I I I I I I I I 
Db 3636 GCGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGCGTTAAGGATGCCATGGCCAAA 3695 

Qy 3703 AT C CAAGCAAAAAT C C CT GGATT GAAG C G C AAAGCAGA 374 0 

I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I 
Db 3696 AT C CAAGCAAAAAT C C CT GGAT T GAAG C G C AAAGCAGA 3733 
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AY114152 3821 bp mRNA linear ROD 17-JUL-2002 

Mus musculus nogo-A mRNA, complete cds. 

AY114152 

AY114152.1 GI:21898576 

Mus musculus (house mouse) 
Mus musculus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus. 

1 (bases 1 to 3821) 

Jin,W., Long,M., Li,R. and Ju,G. 

Cloning and expression of the mouse Nogo-A protein 
Unpublished 

2 (bases 1 to 3821) 

Jin,W., Long,M., Li,R. and Ju,G. 

Direct Submission 

Submitted (17-MAY-2002) Institute of Neurosciences, 17 Chang Le Xi 
Road, Xi'an, Shaanxi 710032, China 

Location/Qualifiers 

1..3821 

/organism="Mus musculus" 
/mol_type="mRNA" 
/strain="BALB/c" 
/db_xref="taxon:10090" 
247..3738 

/note="neurite outgrowth inhibition; RTN4; foocen" 
/codon_start=1 
/product="nogo-A" 
/protein_id="AAM77068.1" 
/db_xref="GI:21898577" 
/translation="MEDIDQSSLVSSSADSPPRPPPAFKYQFVTEPEDEEDEEDEEEE 
QPSWERSPAASAPSLPPAAAVLPSKLPEDDEPPARPPAPAGASPLAEPAAPPSTPAAP 
RANMESKVDKKCFEDSLEQKSHGKDSESRNENASFPSTPELVKDGSRAYITCDSFTSA 

TESTAANIFPVLEDHTSENKTDEKKIEERKAQIITEKTSPKTSNPFLVAIHDSEADYV 

TTDNLSKVTEAVVATMPEGLTPDLVQEACESELNEATGTKIAYETKVDLVQTSEAIQE 

SIYPTAQLCPSFEEAEATPSPVLPDIVMEAPLNSLLPSTGASVAQPSASPLEVPSPVS 

YDGIKLEPENPPPYEEAMSVALKTSDAKEEIKEPESFNAAAQEAEAPYISIACDLIKE 

TKLSTEPSPGFSNYSEIAKFEKSVPDHCELVDDSSPESEPVDLFSDDSIPEVPQTQEE 

AVMLMKESLTEVSETVTQHKHKERLSASPQEVGKPYLESFQPNLHITKDAASNEIPTL 

TKKETISLQMEEFNTAIYSNDDLLSSKEDKMKESETFSDSSPIEIIDEFPTFVSAKDD 

SPKEYTDLEVSNKSEIANVQSGANSLPCSELPCDLSFKNTYPKDEAHVSDEFSKSRSS 

VSKVPLLLPNVSALESQIEMGNIVKPKVLTKEAEEKLPSDTEKEDRSLTAVLSAELNK 

TSVVDLLYWRDIKKTGVVYFGASLFLLLSLTVFSIVSVTAYIALALLSVTISFRIYKG 

VIQAIQKSDEGHPFRAYLESEVAISEELVQKYSNSALGHVNSTIKELRRLFLVDDLVD 

SLKFAVLMWVFTYVGALFNGLTLLILALISLFSIPVIYERHQAQID^^ 

AMAKIQAKIPGLKRKAE" 



ORIGIN 



Query Match 85.5%; Score 3200.4; DB 10; Length 3821; 

Best Local Similarity 92.8%; Pred. No. 0; 

Matches 3488; Conservative 0; Mismatches 211; Indels 61; Gaps 10; 
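
The two percentages in the summary above appear to follow directly from its counts. Dividing the score by the perfect score of 3741 given in the report header reproduces the Query Match, and dividing Matches by the number of aligned columns (Matches + Mismatches + Indels) reproduces the Best Local Similarity. The Python sketch below checks this arithmetic. The function name and both formulas are assumptions inferred from the printed numbers, not documented GenCore behaviour.

```python
def alignment_percentages(score, perfect_score, matches, mismatches, indels):
    """Recompute the two percentages of a summary line (assumed formulas)."""
    # Assumption: every aligned column is a match, a mismatch, or an indel.
    aligned_columns = matches + mismatches + indels
    # Assumption: Query Match is the score as a share of the perfect score.
    query_match = 100.0 * score / perfect_score
    # Assumption: Best Local Similarity is the share of identical columns.
    best_local_similarity = 100.0 * matches / aligned_columns
    return query_match, best_local_similarity


# Values copied from the summary of this result (AY114152).
qm, bls = alignment_percentages(
    score=3200.4, perfect_score=3741, matches=3488, mismatches=211, indels=61
)
print(f"Query Match {qm:.1f}%; Best Local Similarity {bls:.1f}%")
```

Rounded to one decimal place, both values agree with the figures printed in the summary.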

Qy 2 TTGCTCGTCTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCGA 61 

I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 15 TTGCTCATCTGGGCGGCGGCGGCTGCTGCAACTGAGGACAGGACGGGTGGCGCATCTCGA 74 

Qy 62 TCGCGAAGGCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTTC 121 

I I I I I I I I I I II II I I I I I I I I I I I I 1 Mill I I I I I I I III I I II I II 
Db 75 GCGCGGAGGCAGGAGGAGAAGTCTTATTGTTCCTGGAGCTGTCGCCTTTGCGGGTTCCTC 134 

Qy 122 GGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAA 181 

I I I I II I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 135 GGCTTGG TTCGGCCAGCCCGGCCTCTGCCAGTCCTGCCCAACCCCCACAA 184 

Qy 182 CCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCG 241 

I M I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I III II 
Db 185 CCGCCCGCGGCTCTGAGGAGAAGTGGCCC-GCGGCGGCAGTAGCTGCAGCATCATCGCCG 243 

Qy 242 AC C C GC C AG C C AT GGAAGAC AT AGAC C AGT CGTCGCTGGTCTCCTCGTC CAC GGACAGC C 301 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 

Db 244 A C CAT GGAAGAC AT AGAC C AGT CGTCGCTGGTCTCCTCGTCC G CGGATAGC C 295 

Qy 302 CGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAGG 361 

I I I I I M I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I 
Db 2 96 CGCCCCGGCCCCCGCCCGCTTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAGG 355 

Qy 362 ACGAGGAGGAGGAGGAGGAC GAGGAGGAGGACGAC GAGGACCTAGAGGAACTGGAGGT GC 421 

I I I I I I I I I I I I I I I I I II I I I I I I M I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 35 6 ACGAGGAAGACGAGGAG GAGGAG GAGGAC GAC GAGGAC CT G GAGGAAT T G GAGGT G C 412 

Qy 422 TGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGC 4 81 

I I I M I I I I I I I I I I I I II I I I I I I I M I II I I I I I I I I I I I I M I I M M I 
Db 413 TGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCGGTTCCGGT GCCCCCCGCCGCCGCAC 4 69 



Qy 482 CGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCG 541 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I t I I I I I I I I I I I I I I II I I I | I M I I I 

Db 470 CGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCG 529 

Qy 542 CGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGC 601 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I II I I I I I I MINI 

Db 530 CGCCCCCCACCGCCCCTGAGAGGCAGCCGTCCTGGGAACGCAGCCCCGCGGCGTCCGCGC 58 9 

Qy 602 CATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTC 661 

I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 590 CATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCGGAGGACGACGAGCCTC 64 9 

Qy 662 CGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGC 721 

I III I I I I I I I I I I I I I I II I I II I I I I I I I II I II I I I I II I I I I I I I I 

Db 650 CAGCG CGGCCTCCGGCGCCAGCCGGCGCGAGCCCCCTAGCGGAGCCCGCCGCGC 703 

Qy 722 CCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTTT 781 

I I II II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 704 CCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCGGGCTCAGTGGATGAGACCCTTT 7 63 

Qy 7 82 TTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGATT 841 

I I I I I I II I I II I II I I I I I I I I II I I I I I I I II I I I I I I I I II II I I I I I I I I I I I I II 

Db 7 64 TTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGATT 823 

Qy 842 TGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGC 901 

III I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 824 T GAAGGAGC AGC CAGGT AAC AC T GT T T C GT C T G GT CAAGAGGAT T T C C CAT CT GT C CT GT 8 83 

Qy 902 TTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAAC 961 

I I I I I I II I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I 

Db 884 TTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAAC 943 

Qy 962 AT GGAT AC CT T GGTAAC T TAT CAG C AGT GT CAT C CT C AGAAGGAACAAT T GAAGAAACT T 1021 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I II I I I I I I I I I 

Db 944 AC GGAT AC CT T GGTAAC T TAT C AGCAGT G GCAT C C ACAGAAGGAACT AT T GAAGAAAC T T 1003 

Qy 1022 TAAAT GAAGCTT CTAAAGAGTT GCCAGAGAGGGCAACAAAT CCATTT GTAAATAGAGATT 1081 

I I I I I I I I I I I I I I I III I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 

Db 1004 TAAAT GAAGCT T CT AGAGAAT T G C CAGAGAG GG CAACAAAT C CAT T T GT AAAT AGAGAGT 1063 

Qy 1082 T AGC AGAAT T T T CAGAAT T AGAAT AT T C AGAAAT GG GAT CAT C TT T T AAAG G CT C C C CAA 1141 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I 

Db 1064 CAG C AGAGT T T T CAGT AT T AGAAT ACT C AGAAAT GGGAT C AT CTT T CAAT GGCT C C C CAA 1123 

Qy 1142 AAGGAGAGT CAGCCAT ATTAGT AGAAAACACTAAGGAAGAAGTAATT GT GAGGAGTAAAG 1201 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I M I I I I I 

Db 112 4 AAGGAGAGT CAG C CAT GT T AGT AGAAAAC AC T AAGGAAGAAGT AAT T GT GAGGAGTAAAG 1183 

Qy 1202 ACAAAGAGGATTTAGTTTGTAGTGCAGCCCTTCACAGTCCACAAGAATCACCT 1254 

I I I I I I I I I I I I I I II I I I I I I I I I I i I I II I I I I I M I I I I I I MINI 

Db 1184 ACAAAGAGGATTTAGTTTGTAGTGCAGCCCTTCATAATCCACAAGAGTCACCTGCGACCC 1243 

Qy 1255 GT GG GTAAAGAAGAC AGAGT T GT GT CT C C AGAAAAGACAAT GGAC AT T T T T A 1306 

I I I I M I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1244 TTACTAAAGT GGTTAAAGAAGAC GGAGTT AT GT CT C CAGAAAAGACAAT GGAC AT TTT TA 1303 



Qy 1307 ATGAAATGCAGATGTCAGTAGTAGCACCTGTGAGGGAAGAGTATGCAGACTTTAAGCCAT 1366 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1304 ATGAAATGAAAATGTCAGTGGTAGCACCTGTGAGGGAAGAGTATGCAGATTTTAAGCCAT 1363 

Qy 1367 TTGAACAAGCATGGGAAGTGAAAGATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTA 1426 

I I I I M I I I I I II I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1364 TTGAACAAGCATGGGAAGTGAAAGATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTA 1423 

Qy 1427 GAGCTAATGTGGAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAA 1486 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 

Db 1424 GAGCTAATATGGAAAGTAAAGTGGACAAAAAATGCTTTGAAGATAGCCTGGAGCAAAAAA 1483 

Qy 1487 GTCTTGGGAAGGATAGTGAAGGCAGAAATGAGGATGCTTCTTTCCCCAGTACCCCAGAAC 1546 

III I M I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 

Db 1484 GTCATGGGAAGGATAGTGAAAGCAGAAATGAGAATGCTTCTTTCCCCAGTACCCCAGAAC 1543 

Qy 1547 CTGTGAAGGACAGCTCCAGAGCATATATTACCTGTGCTTCCTTTACCTCAGCAACCGAAA 1606 

I I I I I I I I I I I I I I II I I I I II II I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 1544 TTGTGAAGGACGGCTCCAGAGCGTACATCACCTGTGATTCCTTTACCTCAGCAACCGAGA 1603 

Qy 1607 GCACCACAGCAAACACTTTCCCTTTGTTAGAAGATCATACTTCAGAAAATAAAACAGATG 1666 

I II I I I I M I I I I I I I I I I II MINIMI! II I I I I II I I I I I I M I I I I I I 

Db 1604 GTACTGCAGCAAACATTTTCCCTGTGCTAGAAGATCACACTTCAGAAAATAAAACAGATG 1663 

Qy 1667 AAAAAAAAATAGAAGAAAGGAAGGCCCAAATTATAACAGAGAAGACTAGCCCCAAAACGT 1726 

I I I I I I I II I I I I I II I I I I I I I I I I I I M I I I I I II I I I I I I I I I I I I I I I I I I I I I M 

Db 1664 AAAAAAAAATAGAAGAAAGGAAGGCCCAAATTATAACAGAGAAGACTAGCCCCAAAACGT 1723 

Qy 1727 CAAATCCTTTCCTTGTAGCAGTACAGGATTCTGAGGCAGATTATGTTACAACAGATACCT 1786 

I I I I I I M I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 1724 CAAATCCTTTCCTTGTAGCAATACATGATTCCGAGGCAGATTATGTCACAACAGATAATT 1783 

Qy 1787 TATCAAAGGTGACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATTTAG 1846 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I II I I I I I II I I I I I I I I I I I I I 

Db 1784 TATCAAAGGTGACTGAGGCAGTAGTGGCAACCATGCCTGAAGGTCTAACGCCAGATTTAG 1843 

Qy 1847 TTCAGGAAGCATGTGAAAGTGAACTGAATGAAGCCACAGGTACAAAGATTGCTTATGAAA 1906 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I 

Db 1844 TTCAGGAAGCATGTGAAAGTGAACTGAACGAAGCCACAGGTACAAAGATTGCTTATGAAA 1903 

Qy 1907 CAAAAGTGGACTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAGCAC 1966 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I III I I I I I I I I I II I I I I 

Db 1904 CAAAAGTGGACTTGGTCCAGACATCAGAAGCTATACAAGAGTCAATTTACCCCACAGCAC 1963 

Qy 1967 AGCTTTGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATATTG 2026 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I M 
Db 1964 AGCTTTGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATATTG 2023 

Qy 2027 TTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCA 2086 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I II 

Db 2024 TTATGGAAGCTCCATTAAATTCTCTCCTTCCAAGCACTGGTGCTTCTGTAGCGCAGCCCA 2083 

Qy 2087 GTGTATCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCTG 2146 

III I I I I I 11 II I II I III I I I I I I I I II I I I I I I I II I I I I I I I I I I II I I I I 

Db 2084 GTGCATCCCCACTAGAAGTACCGTCTCCAGTTAGTTATGACGGTATAAAGCTTGAGCCTG 2143 

Qy 2147 AAAACCCCCCACCATATGAAGAAGCCATGAATGTAGCACTAAAAGCTTTGGGAACAAAGG 2206 



Db 2144 AAAAT C C C C C AC CAT AT GAAGAAG C CAT GAGT GT AGC ACT AAAAAC AT C G GAC GCAAAGG 2203 

Qy 2207 AAGGAATAAAAGAGC CT GAAAGT T T T AAT GC AGC TGT T C AG GAAACAGAAGCT C CT T AT A 2266 

III III I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 2204 AAGAAAT T AAAGAG C C T GAAAGT T T T AAT GC AGC T GCT CAG GAAG C AGAAG CT C C TT AT A 2263 

Qy 2267 TAT C CAT T GC GT GT GAT T TAAT TAAAGAAAC AAAGC TCT C C ACT GAGC CAAGT C C AGAT T 2326 

I I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 22 64 TAT C CAT T GC AT GT GAT T TAAT T AAAGAAACAAAG CT CT C C AC T GAG C CAAGT C C AG GGT 2323 

Qy 2327 T CT C TAAT TAT T C AGAAAT AG C AAAAT T C GAGAAGT CGGT G C C C GAAC AC GCT GAGC TAG 238 6 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I II II III I I I I I I I 
Db 2 324 TCT CTAAT TAT T C AGAAAT AG CAAAAT T T GAGAAGT CGGT AC C T GAT C ACT GT GAGCT C G 2383 

Qy 2387 T GGAGGAT T C CT C AC CT GAAT CT GAAC C AGT T GAC T TAT T T AGT GAT GAT T C GATT C CT G 244 6 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I M 
Db 2384 TGGAT GATT CCTCACCCGAATCT GAAC CAGTTGACTTATTTAGT GAT GATT CGATTCCTG 24 43 

Qy 2447 AAGT C C C ACAAAC ACAAGAGGAGG CT GT GAT GCT CAT GAAG GAGAGT CT C ACT GAAGT GT 2506 

I I I I I I I I I I I I I I II I I I I I II I I I II I I I I I I M I II I I I I I I I I I I I I I I I I I II I 
Db 2444 AAGT CCCACAAACACAAGAGGAGGCT GT GAT GCTAATGAAGGAGAGT CT CACT GAAGT GT 2503 

Qy 2507 CT GAGACAGTAGC CCAGCACAAA GAG GAG AGAC T T AGT G C C T C AC C T CAG GAG C TAG 2563 

I I I I I I I I I I I I II Mill! II I I I I I I I II I I I I I I I I I I I I I I I I I I M 

Db 25 04 CT GAGACAGTAACACAAC ACAAAC AT AAGGAGAGACT T AGT GCT T C AC CT C AGGAGGT AG 2563 

Qy 2564 GAAAGC CAT AT TT AGAGT CT TT T CAG C C CAAT T T ACAT AGT ACAAAAGAT G CT G CAT C T A 2623 

I I I I I I I I I II I I I I I I I I I I I I I M I I I I I I I I I I I I I I M II I I I I I II I I I II I I I 
Db 2564 GAAAG C CAT AT TT AGAGT CT TT T C AGC C CAAT T T ACAT AT T ACAAAAGAT GCT GC AT CT A 2623 

Qy 2624 AT GACAT T C CAACATTGACCAAAAAGGAGAAAATTT CTTT GCAAAT GGAAGAGTTTAAT A 2683 

I I I I I II I I I I I I I I I I M I I I I I I I I I I I I I I I I I I II I I I I I I II II II I I I I I I I 
Db 2624 AT GAAATT CCAACATT GAC CAAAAAGGAGACAATTT CTTT GCAAAT GGAAGAGT TTAATA 2683 

Qy 2684 CT GCAAT T TAT T CAAAT GAT GACT T AC T T T CT T CTAAG GAAGACAAAAT AAAAGAAAGT G 2743 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I! 
Db 2684 C T GCAAT T TAT T C CAAT GAT GACT TACT T T C T T C TAAGGAAGACAAAAT GAAAGAAAGT G 2743 

Qy 2744 AAAC AT T TT C AGAT T CAT C T C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT TT GT CAGT G 2803 

I I I I I I I I I I I II I I I I II I I I I I I I II I I II I I I I I I I I I I I I II I I I I I I I I II 
Db 2744 AAAC AT T T T C C GAT T CAT C T C C CAT T GAG AT AAT AGAT GAGT T T C C CAC AT T T GT CAGT G 2803 

Qy 2 8 04 CTAAAGAT GATT C T C CT AAATT AGC CAAG GAGT ACACT GAT C T AGAAGT AT C C GACAAAA 28 63 

I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 

Db 28 04 CTAAAGAT GATT CT CCT AAG GAGT AC ACT GAC CT AGAAGT AT CCAACAAAA 2854 

Qy 28 64 GTGAAATTGCTAATATCCAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTG 2923 

I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 28 55 GTGAAATTGCTAATGTCCAGAGCGGGGCCAATTCGTTGCCTTGCTCAGAATTGCCCTGTG 2914 

Qy 2924 AC CTTTCTTT CAAGAAT AT AT AT C CTAAAGAT GAAGTACAT GT T T C AGAT GAAT T CT C C G 2983 

I M I I I 1 I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 2915 ACCTTT CTT T CAAGAAT AC AT AT C CTAAAGAT GAAGCACAT GT CT CAGAT GAATT CTCCA 2974 

Qy 2984 AAAATAGGTCCAGTGTATCTAAGGCATCCATATCGCCTTCAAATGTCTCTGCTTTGGAAC 3043 

Ml I I I M I I I I I I I I I I I I M I II III II I I I I I I I I I I I I I I I I I I I I 



Db 2975 AAAGTAGGTCCAGTGTATCTAAGGTGCCCTTATTGCTTCCAAATGTTTCTGCTTTGGAAT 3034 



Qy 304 4 C T C AGAC AGAAAT G GGC AG CAT AGT T AAAT C C AAAT CACT T AC GAAAGAAGC AGAGAAAA 3103 

II I I I I I I I I I I I I I I MILIUM I I I I I I I || | | | | | | | | | M I I I I I II 

Db 3035 C T C AAAT AGAAAT GGG CAAC AT AGT T AAAC C CAAAGT AC T T AC GAAAGAAGC AGAGGAAA 3094 

Qy 3104 AAC T T C CT T CT GAC AC AGAGAAAGAG GAC AGAT CC CT GT C AGCT GT AT T GT C AGC AGAGC 3163 

I I I I I I I I I I I I I I I I I II I I I I M I I I I I I I I II M I I I I I I I I M I I I I I I I I I I I 

Db 3095 AACTT CCTT CTGATACAGAGAAAGAGGACAGAT CC CT GACAGCT GTATT GT CAGCAGAGC 3154 

Qy 3164 T GAGT AAAACTT CAGT T GT T GAC CT C C T C TACT GGAGAGAC AT T AAGAAGAC T GGAGT GG 3223 

III I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 

Db 3155 T GAAT AAAACTT CAGT T GT T GAC C T C CT GT AC T GGAGAGAC AT TAAGAAGACT GGAGT GG 3214 

Qy 3224 TG TTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTG 3280 

II I I I I I I I I II I I I I I I I I I I I M I I M II I I I I I I I I I I I I I II II I II I I I I II 

Db 3215 TGTATTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTG 3274 

Qy 32 81 TAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGG 334 0 

I I M I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I M I M I I I I I II I I I I I I 
Db 3275 TAACGGCCTACATTGCCTTGGCCCTGCTCTCTGTGACTATCAGCTTTAGGATATATAAGG 3334 

Qy 3341 GCGT GAT CC AG GCT AT CC AGAAAT CAGAT GAAGGCCAC C CATT CAGGGCATATTTAGAAT 34 00 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I M M I I I I I II I I I I I I I I I II I I I I I 
Db 3335 GT GT GAT C CAAG CT AT C CAGAAAT CAGAT GAAGGC CAC C CATT CAG GGC AT AT TT GGAAT 3394 

Qy 3401 CT GAAGTT GCT ATATCAGAGGAATT GGT T C AGAAAT AC AGT AATT CT GCT CTT GGT CAT G 3460 

I I I I I I I I I I I M I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3395 CT GAAGT T GC C AT AT C AGAGGAAT T GGT T CAGAAAT AT AGTAATT CT GCTCTTGGT CAT G 3454 

Qy 34 61 T GAAC AGC ACAAT AAAAGAACT GAG GCGGCTTTT CT T AGT T GAT GATT T AGT T GAT T C C C 3520 

I I I M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I M I I I I II I I I I I I I I I 
Db 34 55 T GAAC AGC ACAAT AAAAGAAT T GAGGC GT CT CT T CTT AGT T GAT GAT T T AGT T GAT T C C C 3514 

Qy 3521 TGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGA 3580 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I II 
Db 3515 TGAAGTTTGCAGTGTTGATGTGGGTATTTACTTACGTTGGTGCCTTGTTCAATGGTTTGA 3574 

Qy 3581 CACT ACT GAT T T T AGCT CT GAT C T CAC T C T T CAGT AT T C CT GT T ATTT AT GAAC GGC AT C 3640 

I I I I I I I I I I M I I I I I I I I I I I I I M M I II II I I I M I I I I I I I I I I I I I I I I I I I I 
Db 3575 CACT ACT GAT T T T AGCT CT GAT C T CAC T CT T CAGT AT T C CT GT TAT AT AT GAAC GGC AT C 3634 

Qy 3641 AGGT G CAGAT AGAT CATT AT CT AG GACT T GC AAAC AAGAGT GT T AAGGAT GC CAT GGC C A 3700 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I II I II I M I I I I I I I I I I 

Db 3635 AGGC G CAGAT AGAT CATT AT CT AG GACT T GC AAACAAGAGC GT TAAGGAT G C CAT GGC CA 3694 

Qy 3701 AAAT C CAAG CAAAAAT CC C T GGAT T GAAGC G CAAAGC AGA 3740 

I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I II 
Db 3695 AAAT C CAAG CAAAAAT C C CT GGAT T GAAGC GCAAAGC AGA 3734 



RESULT 5 
BC056373 
LOCUS       BC056373 4518 bp mRNA linear ROD 08-OCT-2003 
DEFINITION  Mus musculus cDNA clone MGC:73436 IMAGE:6847916, complete cds. 
ACCESSION   BC056373 
VERSION     BC056373.1 GI:33604147 
KEYWORDS    MGC. 
SOURCE      Mus musculus (house mouse) 
ORGANISM 
REFERENCE 
AUTHORS 
TITLE 
JOURNAL 
MEDLINE 
PUBMED 
REFERENCE 
AUTHORS 
TITLE 
JOURNAL 
REMARK 
COMMENT 
FEATURES Location/Qualifiers 
source 1..4518 
/organism="Mus musculus" 
/mol_type="mRNA" 
/strain="C57BL/6" 
/db_xref="taxon:10090" 
/clone="MGC:73436 IMAGE:6847916" 
/tissue_type="Brain, mouse, 13.5,14.5,16.5,17.5 dpc" 
/clone_lib="NIH_BMAP_FY0" 
/lab_host="DH10B" 
/note="Vector: pYX-ASC" 
CDS 120..2282 
/codon_start=1 
/product="Unknown (protein for MGC:73436)" 
/protein_id="AAH56373.1" 
/db_xref="GI:33604148" 
/translation="MEDIDQSSLVSSSADSPPRPPPAFKYQFVTEPEDEEDEEDEEEE 
EDDEDLEELEVLERKPAAGLSAAPVPPAAAPLLDFSSDSVPPAPRGPLPAAPPTAPER 
QPSWERSPAASAPSLPPAAAVLPSKLPEDDEPPARPPAPAGASPLAEPAAPPSTPAAP 
KRRGSGSVDETLFALPAASEPVIPSSAEKIMDLKEQPGNTVSSGQEDFPSVLFETAAS 
LPSLSPLSTVSFKEHGYLGNLSAVASTEGTIEETLNEASRELPERATNPFVNRESAEF 
SVLEYSEMGSSFNGSPKGESAMLVENTKEEVIVRSKDKEDLVCSAALHNPQESPATLT 
KVVKEDGVMSPEKTMDIFNEMKMSVVAPVREEYADFKPFEQAWEVKDTYEGSRDVLAA 
RANMESKVDKKCFEDSLEQKGHGKDSESRNENASFPRTPELVKDGSRAYITCDSFSSA 
TESTAANIFPVLEDHTSENKTDEKKIEERKAQIITEKTSPKTSNPFLVAIHDSEADYV 
TTDNLSKVTEAVVATMPEGLTPDLVQEACESELNEATGTKIAYETKVDLVQTSEAIQE 
SIYPTAQLCPSFEEAEATPSPVLPDIVMEAPLNSLLPSTGASVAQPSASPLEVPSPVS 
YDGIKLEPENPPPYEEAMSVALKTSDSKEEIKEPESFNAAAQEAEAPYISIACDLIKE 
TKLSTEPSPEFSNYSEIAKFEKSVPDHCELVDDSSPES" 

ORIGIN 

Query Match 83.9%; Score 3140.4; DB 10; Length 4518; 

Best Local Similarity 93.3%; Pred. No. 0; 

Matches 3386; Conservative 0; Mismatches 196; Indels 48; Gaps 8; 
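
The score in the summary above is consistent with a simple linear scoring scheme, which the Python sketch below illustrates. The gap penalties (Gapop 10.0, Gapext 1.0) come from the report header. Everything else is an assumption chosen because it reproduces the printed score: a reward of +1 per match, a penalty of -0.6 per mismatch, and charging the opening penalty once per gap plus the extension penalty once per indel residue. The function name is hypothetical, and the actual IDENTITY_NUC table may be defined differently.

```python
def reconstruct_score(matches, mismatches, indels, gaps,
                      match=1.0, mismatch=-0.6,    # assumed, not from the report
                      gap_open=10.0, gap_ext=1.0): # Gapop / Gapext from the header
    """Rebuild an alignment score from summary counts (assumed scheme)."""
    substitution_part = matches * match + mismatches * mismatch
    # Assumption: each gap pays gap_open once, each indel residue pays gap_ext.
    gap_part = gaps * gap_open + indels * gap_ext
    return substitution_part - gap_part


# Counts copied from the summary of this result (BC056373).
score = reconstruct_score(matches=3386, mismatches=196, indels=48, gaps=8)
print(f"reconstructed score: {score:.1f}")
```

Under these assumptions the reconstructed value agrees with the reported score of 3140.4, which suggests, but does not prove, that the search used this scheme.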

Qy 12 9 CTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCG 188 

I 1 I I I I I I I I I I I III MM I I II M I II II I II II II I II II II I II II I I 

Db 5 CTCGGCTTGGTTCGGCCAGCCCGGCCTCTGCCAGTCCTGCCCAACCCCCACAACCGCCCG 64 

Qy 18 9 CGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCC 248 

II I I I I II I II II I I II I II I I II II I I II II I II II I I I II III II I 

Db 65 CGGCTCTGAGGAGAAGTGGCCC-GCGGCGGCAGTAGCTGCAGCATCATCGCCGA 117 

Qy 24 9 AG C CAT G GAAGACAT AGAC CAGT CGTCGCTGGTCTCCTC GT C C AC GGACAGC CC G C C C C G 308 

I II II I II I II I I II I I I II II I I I II M I I I II I I I I I II II II II I II II I! I I 
Db 118 --CCATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCGCGGATAGCCCGCCCCG 175 

Qy 309 GCCTCCGCCCGCCTT CAAGT AC CAGT T C GT GAC GGAGCC C GAGGAC GAG GAGGAC GAGG A 368 

Ml I I I II II I M II II I II M I II II II I II I I II I II I I I II I II II I II II II M 
Db 17 6 GCCCCCGCCCGCTTT CAAGT AC CAGT T C GT GAC GGAGC C C GAGGAC GAGGAGGAC GAGGA 235 

Qy 369 GGAG GAG GAG GAC GAGGAGGAGGAC GAC GAGGAC CTAGAG GAACT GGAGGT GCT G GAGAG 428 

M II I I It I M I I II II II II I II I I I I I II II I I II I I I I II I II I I II I I I 
Db 236 AGACGAGGAG GAGGAGGAGGAC GACGAGGACCTGGAGGAATT GGAGGT GCT GGAGAG 292 

Qy 429 GAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGCCGCTGCT 4 88 

II M M I M 1 I M II II II M I II III I I I I I I I I I II II I II I I II II I I I I 
Db 2 93 GAAGCCCGCAGCCGGGCTGTCCGCGGCTCCGGT GCCCCCCGCCGCCGCACCGCTGCT 34 9 



Qy 489 GGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCC 548 



Db 350 GGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCC 409 

Qy 549 TGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGCCATCCCT 608 

I I I I I I I I I I I I I I I I I I II I 1 I I I I I I I I I I I I I I I I M I | I I I I I I I I I I I I I 
Db 410 CACCGCCCCTGAGAGGCAGCCGTCCTGGGAACGCAGCCCCGCGGCGTCCGCGCCATCCCT 4 69 

Qy 609 GCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCCGGCGAG 668 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I || I I I I I I I I I I | Ml 
Db 470 GCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCGGAGGACGACGAGCCTCCAGCG-- 527 

Qy 669 GCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGCCCCCTTC 728 

I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I || i I I I I I I I I I I II I 
Db 528 CGGCCTCCGGCGCCAGCCGGCGCGAGCCCCCTAGCGGAGCCCGCCGCGCCCCCTTC 583 

Qy 729 CACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTTTTTGCTCT 7 88 

I II I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 584 CACGCCGGCCGCGCCCAAGCGCAGGGGCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCT 643 

Qy 7 89 T C C T G C T GC AT CT GAGC C T GT GAT AC CCTCCTCT GCAGAAAAAAT TAT GGAT TT GAT GGA 84 8 

I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II M III 
Db 644 T C CT GC T GC AT C T GAG C CT GT GAT AC CCTCCTCT GCAGAAAAAAT TAT GGAT T T GAAGGA 703 

Qy 849 GCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAAC 908 

I I I I I M I M I I I I I I M I I I I I I I II I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I 
Db 704 GCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGTTTGAAAC 763 

Qy 909 TGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATA 968 

I I I I I I I I I I I I I M M I I I II I I I I I I I I I I I I I I II I I I I I II I I I I I I I I | | | | I I 
Db 764 TGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACACGGATA 823 

Qy 969 C C T T GGTAACT TAT CAGC AGT GT CAT C CT C AGAAGGAACAAT T GAAGAAACT T TAAAT GA 1028 

I I I I I I M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 824 CC T T G GTAACT TAT CAG C AGT GGC AT C C AC AGAAG GAAC T ATT GAAGAAAC T T TAAAT GA 883 

Qy 1029 AGC T T C T AAAGAGT T G C C AGAGAGGGCAAC AAAT C CAT T T GT AAAT AGAGAT T TAG C AGA 1088 

I M I I I I I Ml I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 8 84 AG CT T CT AGAGAAT T GC C AGAGAG G GCAACAAAT C CAT T T GTAAAT AGAGAGT CAGC AGA 943 

Qy 108 9 AT TT T CAGAATTAGAAT ATTCAGAAAT GGGAT CAT CTTTTAAAGGCT C CCCAAAAGGAGA 1148 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I II I I I I I I I I || I I I II I I 
Db 944 GT TT T CAGT AT TAGAAT ACT CAGAAAT GG GAT CAT CT T T CAAT G GC T C C C CAAAAGGAGA 1003 

Qy 1149 GT CAGCCAT ATTAGTAGAAAACACTAAGGAAGAAGTAATT GT GAGGAGTAAAGACAAAGA 1208 

I I I I I I M I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I 
Db 1004 GT CAGCCAT GTTAGTAGAAAACACTAAGGAAGAAGTAATT GT GAGGAGTAAAGACAAAGA 1063 

Qy 1209 GGATTTAGTTTGTAGTGCAGCCCTTCACAGTCCACAAGAATCACCT 1254 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1064 G GAT T T AGT T T GT AGT G C AGCC CT T C AT AAT C C ACAAGAGT CAC CT GCGAC C CT T ACT AA 1123 

Qy 1255 - GTGGGTAAAGAAGACAGAGTTGT GT CT CCAGAAAAGACAATGGACATTTTTAAT GAAAT 1313 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 1124 AGTGGTTAAAGAAGAC GGAGTTAT GT CT C CAGAAAAGACAAT GGACATTTT TAAT GAAAT 1183 

Qy 1314 GC AGAT GT CAGT AGT AGC AC CT GT GAG G GAAGAGT AT G CAGACT T T AAG C CAT T T GAAC A 1373 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I 



Db 1184 GAAAAT GT CAGT GGT AG C AC CT GT GAG G GAAGAGT AT GC AGAT T T T AAGC CAT T T GAAC A 1243 

Qy 1374 AGC AT GGGAAGT GAAAGAT ACT TAT GAGGGAAGT AG G GAT GTGCTGGCT GC T AGAGCT AA 1433 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I | I I | I M I I I I I 
Db 1244 AG CAT GG GAAGT GAAAGAT ACT TAT GAGG GAAGT AGGGAT GTGCTGGCT GCT AGAGCTAA 1303 

Qy 1434 T GT GGAAAGT AAAGT G GAC AGAAAAT G C T T GGAAGAT AGC C T G GAG CAAAAAAGT CT T GG 1493 

I M I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I IN M I 
Db 1304 TAT GGAAAGTAAAGT GGACAAAAAATGCTTT GAAGATAGC CT GGAGCAAAAAGGT CAT GG 1363 

Qy 1494 GAAG GATAGT GAAG GC AGAAAT GAGGAT GCTTCTTTC C CC AGT AC C C CAGAAC CT GT GAA 1553 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i II I I I I I I I I | I MINI 
Db 1364 GAAGGATAGT GAAAGCAGAAAT GAGAAT GCTT CT TT CC CCAGGAC C CCAGAACTT GT GAA 1423 

Qy 1554 GGACAG C T C C AGAGC AT AT AT T AC CT GT GCTTCCTT T AC CT CAGCAAC C GAAAG C AC C AC 1613 

I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I 
Db 1424 GGACGGCTCCAGAGCGTACATCACCTGTGATTCCTTTAGCTCAGCAACCGAGAGTACTGC 1483 

Qy 1614 AGCAAACACTTTCCCTTTGTTAGAAGAT CATACTT CAGAAAATAAAACAGATGAAAAAAA 1673 

I I I I I I I I I I I I II I II I I I I I I I I I I II I I II I I I I I I I I I I I II I I I I I I I I I 
Db 14 8 4 AG CAAACAT T T T C C CT GT GCT AGAAGAT CAC ACT T C AGAAAACAAAACAGAT GAAAAAAA 154 3 

Qy 1674 AAT AGAAGAAAGGAAGGC C CAAAT T AT AAC AGAGAAGACT AG C CC CAAAAC GT CAAAT C C 1733 

I I I M I I I I I I I M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 1544 AAT AGAAGAAAGGAAGGCCCAAATT ATAACAGAGAAGACT AGC CCCAAAAC GT CAAATC C 1603 

Qy 1734 T T T C CT T GT AGC AGT AC AGGAT T CT GAGGC AGAT TAT GTT ACAACAGAT AC CT T AT CAAA 1793 

M I I I I I I I I I II I I I I I I I II I I I II I M I I I I | I | | | || | | | || | I I I I | I I I 
Db 1604 T T T C CT T GT AGCAAT ACAT GAT T C T GAG GC AGAT TAT GT C ACAACAGAT AAT T TAT CAAA 1663 

Qy 1794 GGT GAC T GAGGC AG CAGT GT CAAACAT GC CT GAAGGT C T GAC GC CAGATT T AGT T C AGGA 1853 

I I I I I I I I I I I I I I I I I I III I I I II I I I I I I I I I I I I I I I I | M | | | | | | | | | | | 
Db 1664 G GT GACT GAGGC AGT AGT GGCAACC AT GCCT GAAGGT CTAACGCCAGATTTAGTTC AGGA 1723 

Qy 1854 AGC ATGT GAAAGT GAACT GAAT GAAGC C AC AGGT ACAAAGAT T GC T TAT GAAACAAAAGT 1913 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I | I | | | | | | | | | | | 
Db 1724 AGC AT GT GAAAGT GAACT GAAC GAAG C CAC AGGT ACAAAGAT T GCT TAT GAAACAAAAGT 1783 

Qy 1914 G GACT T G GT CCAAAC AT CAGAAGC T AT ACAAGAAT CACT T T AC C C C AC AGCAC AGC T T T G 1973 

I M I I II I I I I I I I I I I I I I I I I I I I I I I I II Ml I I I I I I I I I I I I I I I I I I I I I I 
Db 1784 G GACT T G GT CCAGAC AT C AGAAGCT AT ACAAGAGT CAAT T T AC C C C ACAG C AC AGCT T T G 1843 

Qy 1974 C C CAT CAT T T GAG GAAGC T GAAGCAACT C C GT CAC CAGT T TT G C CT GAT AT T GTT AT GGA 2033 

I I M I I I I I I M M I I I I I II I I I I I I I I I I I I I I I II I I I I I I I M I I I I I I I M I I I I 
Db 1844 C C CAT CAT T T GAG GAAG CT GAAGCAACT C C GT CAC CAGT T TT GCCT GAT AT T GTT AT GGA 1903 

Qy 2034 AGCAC CATTAAATTCTCTCCTTCCAAGCGCT GGT GCTT CTGT AGT GCAGCC CAGT GT AT C 2093 

Ml I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I II I I I I I I I I I II I I III 
Db 1904 AGCGCCATTAAATTCTCTCCTTCCAAGCACTGGTGCTTCTGTAGCGCAGCCCAGTGCATC 1963 

Qy 2094 C C CAC T GGAAG CAC CT C CT C CAGT T AGTT AT GAC AGT AT AAAGC T T GAGC CT GAAAAC C C 2153 

I I I I I I I I I I Ml I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I II 
Db 1964 C C CAC T AGAAGT AC C GT CT C C AGT T AGTTAT GAC GGT AT AAAG CT T GAG C C T GAAAAT C C 2023 

Qy 2154 C C CAC CAT AT GAAGAAGC CAT GAAT GT AGC AC T AAAAGC T TT GG GAACAAAGGAAG GAAT 2213 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I III 
Db 2 024 C C CAC CAT AT GAAGAAGC CAT GAGT GT AGC ACT AAAAACAT C GGACT C AAAG GAAGAAAT 2083 



Qy 2214 AAAAGAGC CT GAAAGT T TTAAT GC AG CT GT T C AGGAAAC AGAAGCT C CT TAT AT AT C CAT 2273 

I M I I I I I I I I I I I II I II I I I I I I I | I | | | | I I I | | I I I I I I | | 

Db 2084 T AAAGAGC CT GAAAGT T T TAAT GCAG CT GCT C AGGAAG CAGAAGCT C C T TAT AT AT C CAT 214 3 

Qy 2274 T G C GT GT GAT T TAAT T AAAGAAACAAAG CT C T C C ACT GAGC CAAGT C C AGAT TT CT C T AA 2333 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I II 
Db 2144 T GC AT GT GAT T TAAT T AAAGAAACAAAGCT C T C CACT GAG C CAAGT C C AGAGT T CT CTAA 22 03 

Qy 2334 T TAT T CAGAAAT AGC AAAAT T C GAGAAGT C G GT GC C C GAAC AC GC T GAG CT AGT GGAGGA 2393 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I || Ml M I I I I I II I I II 
Db 2204 T TAT T CAGAAAT AG CAAAAT T T GAGAAGT C G GT GC CT GAT CACT GT GAGCT C GT GGAT GA 2263 

Qy 2394 TTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTCCTGAAGTCCC 24 53 

I I M M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I M II I II I I II I I I I I 
Db 2264 TTCCTCACCCGAATCTTAACCAGTTGACTTATTTAGTGATGATTCAATTCCTGAAGTCCC 2323 

Qy 2454 ACAAACACAAGAGGAGGCT GTGAT GCT CAT GAAGGAGAGT CTCACTGAAGT GTCT GAGAC 2513 

M M I I II I I I I I I I I I I I I I I I I I | I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2324 ACAAACACAAGAGGAGGCT GT GAT G CT AAT GAAGGAGAGT CT CACT GAAGT GT CT GAGAC 2383 

Qy 2514 AGT AGC C C AG C ACAAA GAGGAGAGACTT AGT GCCT CAC CTCAGGAGCTAGGAAAGCC 2570 

I I I I I M I I I I I I I I II || I I I I I || | | | | || | | | | | M | | | | M I I I I I I 

Db 2384 AGTAACACAAC ACAAAC ATAAG GAGAGACT T AGT GCT T C AC CT C AGGAGGT AGGAAAGC C 244 3 

Qy 2571 AT AT T TAGAGT CT T T T C AGC CCAAT T TACAT AGT ACAAAAGAT GCT G CAT C TAAT GAC AT 2 63 0 

I I I I M I I I I I I I II I I II I I I I I I I | | | | | | M I I I I I I I I I I I I I I I I I I I I M M 
Db 244 4 AT AT T T AGAGT CT T T TC AGC C CAAT T TACAT AT T ACAAAAGAT GCT GC AT C TAAT GAAAT 2503 

Qy 2631 T C CAACAT T GAC CAAAAAG GAGAAAAT T T CT T T GCAAAT GGAAGAGT TTAAT ACT G CAAT 2 690 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | || | | | M I II I I I I I I I I 
Db 2504 T C CAACAT T GAC CAAAAAGGAGACAAT T T CT T T GCAAAT GGAAGAGTT TAAT ACT GCAAT 2563 

Qy 2691 TTATT CAAAT GAT GACTT ACTTT CTT CTAAGGAAGACAAAATAAAAGAAAGTGAAACATT 2750 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I M | 
Db 2564 T TAT T C CAAT GAT GACT T AC T T T CT T CTAAGGAAGACAAAAT GAAAGAAAGT GAAAC AT T 2623 

Qy 2751 TT CAGATT CAT CT C C GAT T G AGAT AAT AGAT GAATTTCCCACGTTT GT CAGT GCTAAAGA 2810 

Ml II I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I II I I I I I I || I I 
Db 2624 T T C C GAT T CAT C T C C CATT GAGAT AAT AGAT G AGT T T CC C ACAT T T GT CAGT G C TAAAGA 2683 

Qy 2811 TGATTCTCCTAAATTAGCCAAGGAGTACACTGATCTAGAAGTATCCGACAAAAGTGAAAT 2 87 0 

I I I I II I I I I I I I I I I I I II I I M I I I I I I I I I I I I I I I I M I I I I I I I 

Db 2684 TGATTCTCCT AAGGAGTACACT GAC CTAGAAGTAT CCAACAAAAGT GAAAT 2734 

Qy 2871 T G CTAATAT C CAAAGC G G GG CAGATT CAT TGC CTT GCT TAGAATTGCCCTGT GAC CTTTC 2930 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I 
Db 2735 TGCTAATGTCCAGAGCGGGGCCAATTCGT TGC CTT GCT CAGAATTGCCCTGT GAC CTTTC 2794 

Qy 2931 T T T CAAGAAT AT AT AT C CT AAAGAT GAAGT AC AT GT T T CAGAT GAAT T CT C C GAAAAT AG 2990 

I I I I II I I I I I I I I I I I I I I I I I || I || I I | | | | I I I | M I I I I M I I I III III 
Db 27 95 T T T CAAGAAT AC AT AT C CTAAAGAT GAAG C ACAT GT C T CAGAT GAAT T CT C CAAAAGT AG 2 8 54 

Qy 2 991 GT C CAGT GT AT CT AAGG CAT C CAT AT C GC CT T CAAAT GT C T C T GCTT T G GAAC C T CAGAC 3050 

I I M I I M M I M M I I I I I I I I I I I I I I I I I I II II I I I 

Db 2855 GTCCAGTGTATCTAAGGTGCCCTTATTGCTTCCAAATGTTTCTGCTTTGGAATCTCAAAT 2 914 



Qy 3051 AGAAAT GGGCAGCATAGT TAAAT CCAAAT CACTTACGAAAGAAGCAGAGAAAAAACTT CC 3110 

I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I || I I I 
Db 2 915 AGAAAT GG GCAACAT AGT TAAAC C CAAAGT AC T T ACGAAAGAAGC AGAG GAAAAACTT C C 2 974 

Qy 3111 T T C T GAC AC AG AGAAAGAGG AC AGAT C C CT GT C AG CT GT AT T GT C AGC AGAG CT GAGT AA 317 0 

I I I I I I I I I I I I I I I I I I I I I I I I I II I M I I I II II I I I I I I I I I I 1 I I I I II Ml 
Db 2975 T T CT GAT ACAGAGAAAGAG GAC AGAT C C C T GAC AGCT GT AT T GT CAG C AGAGC T GAATAA 3034 

Qy 3171 AACT T C AGTT GT T GAC C T C C T C TACT G GAG AGACATT AAGAAGAC T GG AGT G GT GT T T GG 3230 

I I I I I I I II I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II | I I I 
Db 3035 AACTTCAGTTGTTGACCTCCTGTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGG 3094 

Qy 3231 TGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTA 3290 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 
Db 3095 TGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTA 3154 

Qy 32 91 CAT T G C CT T G GC CCTGCTCTCGGT GAC TAT C AG CT T T AGGAT AT ATAAG G G C GT GAT C C A 3350 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II 
Db 3155 CATTGCCTTGGCCCTGCTCTCTGTGACTATCAGCTTTAGGATATATAAGGGTGTGATCCA 3214 

Qy 3351 GG CT AT C C AGAAAT CAGAT GAAGGC CAC C CAT T C AGGGC ATAT TT AGAAT CT GAAGTT G C 3 410 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II II I I I I I I I 
Db 3215 AGCT AT C C AGAAAT CAGAT GAAGGC CAC C CAT T C AGGGC AT AT TT GGAAT CT GAAGT T GC 3274 

Qy 3411 TAT AT C AGAGGAAT T GGT T C AGAAAT AC AGT AAT TCTGCTCTTGGT CAT GT GAAC AGC AC 3470 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I M I I I I I I I I I || I I I I I I I 
Db 3275 CAT AT C AGAGGAAT T GGT T C AGAAAT AT AGTAAT T CT GC T C TT GGT C AT GT GAACAGC AC 3334 

Qy 3471 AATAAAAGAACTGAGGCGGCTTTTCTTAGTT GAT GATTT AGTT GATTCCCTGAAGTTTGC 353 0 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I 
Db 3335 AAT AAAAGAAT T GAGGC GT C T CT T C T T AGT T GAT GAT T T AGT T GAT T C C CT GAAGT T T G C 3394 

Qy 3531 AGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGAT 3590 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I || I I I I M I I I I I I I I II I I I I I I II I 
Db 3395 AGT GT T GAT GT G GGT ATT T AC T T AC GTTGGTGCCTTGTT CAAT GGTT T GACACT AC T GAT 3454 

Qy 3591 T T TAG C T CT GAT CT C ACT CT T C AGT ATT C C T GT TAT T TAT GAACG GC AT CAG GT GCAGAT 3650 

I I II I I I I I I I I I I I I I I I I I I II I II II I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 3455 T T T AGC T C T GAT CT C ACT C TT C AGT ATT C CT GT TAT AT AT GAAC G G CAT CAG GC GCAGAT 3514 

Qy 3651 AGAT CAT TAT CT AGGACT T GCAAACAAGAGT GT T AAGGAT G C CAT GGC CAAAAT C CAAGC 3710 

I II I LI I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I II I I I I I I 
Db 3515 AGAT CAT TAT CT AGGACT T GCAAAC AAGAGC GT TAAGGAT GC CAT GGC CAAAAT C CAAGC 357 4 

Qy       3711 AAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740
              ||||||||||||||||||||||||||||||
Db       3575 AAAAATCCCTGGATTGAAGCGCAAAGCAGA 3604
LOCUS       AY102280 4063 bp mRNA linear ROD 29-JAN-2003
DEFINITION  Mus musculus RTN4 (Rtn4) mRNA, complete cds, alternatively spliced.
ACCESSION   AY102280
VERSION     AY102280.1 GI:23379808
SOURCE      Mus musculus (house mouse)
  ORGANISM  Mus musculus

            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
REFERENCE   1 (bases 1 to 4063)
  AUTHORS   Oertle,T., Huber,C., van der Putten,H. and Schwab,M.E.
  TITLE     Genomic Structure and Functional Characterisation of the Promoters
            of Human and Mouse nogo/rtn4
  JOURNAL   J. Mol. Biol. 325 (2), 299-323 (2003)
  MEDLINE   22376540
   PUBMED   12488097
REFERENCE   2 (bases 1 to 4063)
  AUTHORS   Oertle,T. and Schwab,M.E.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-MAY-2002) Brain Research Institute, University of
            Zurich and ETH Zurich, Winterthurerstr. 190, Zuerich 8057,
            Switzerland
REFERENCE   3 (bases 1 to 4063)
  AUTHORS   Van der Putten,H.
  TITLE     Direct Submission
  JOURNAL   Submitted (07-MAY-2002) Nervous System Research, Novartis Pharma
            Inc., Basel, Switzerland
FEATURES             Location/Qualifiers
     source          1..4063
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /strain="129/SvcJ7"
                     /db_xref="taxon:10090"
                     /chromosome="11"
     gene            1..4063
                     /gene="Rtn4"
                     /note="synonym: nogo"
     5'UTR           1..32
                     /gene="Rtn4"
                     /evidence=experimental
     CDS             33..3173
                     /gene="Rtn4"
                     /note="RTN4-D; alternatively spliced"
                     /codon_start=1
                     /product="RTN4"
                     /protein_id="AAM73502.1"
                     /db_xref="GI:23379809"

                     /translation="MAPPLAGGGQKGGAASEAWVPSLFVGVSGSTCTAAKSLVPIPAR
                     SSRLSAARNETLFALPAASEPVIPSSAEKIMDLKEQPGNTVSSGQEDFPSVLFETAAS
                     LPSLSPLSTVSFKEHGYLGNLSAVASTEGTIEETLNEASRELPERATNPFVNRESAEF
                     SVLEYSEMGSSFNGSPKGESAMLVENTKEEVIVRSKDKEDLVCSAALHNPQESPATLT
                     KVVKEDGVMSPEKTMDIFNEMKMSVVAPVREEYADFKPFEQAWEVKDTYEGSRDVLAA
                     RANMESKVDKKCFEDSLEQKGHGKDSESRNENASFPRTPELVKDGSRAYITCDSFSSA
                     TESTAANIFPVLEDHTSENKTDEKKIEERKAQIITEKTSPKTSNPFLVAIHDSEADYV
                     TTDNLSKVTEAVVATMPEGLTPDLVQEACESELNEATGTKIAYETKVDLVQTSEAIQE
                     SIYPTAQLCPSFEEAEATPSPVLPDIVMEAPLNSLLPSTGASVAQPSASPLEVPSPVS
                     YDGIKLEPENPPPYEEAMSVALKTSDSKEEIKEPESFNAAAQEAEAPYISIACDLIKE
                     TKLSTEPSPEFSNYSEIAKFEKSVPDHCELVDDSSPESEPVDLFSDDSIPEVPQTQEE
                     AVMLMKESLTEVSETVTQHKHKERLSASPQEVGKPYLESFQPNLHITKDAASNEIPTL
                     TKKETISLQMEEFNTAIYSNDDLLSSKEDKMKESETFSDSSPIEIIDEFPTFVSAKDD
                     SPKEYTDLEVSNKSEIANVQSGANSLPCSELPCDLSFKNTYPKDEAHVSDEFSKSRSS
                     VSKVPLLLPNVSALESQIEMGNIVKPKVLTKEAEEKLPSDTEKEDRSLTAVLSAELNK
                     TSVVDLLYWRDIKKTGVVFGASLFLLLSLTVFSIVSVTAYIALALLSVTISFRIYKGV
                     IQAIQKSDEGHPFRAYLESEVAISEELVQKYSNSALGHVNSTIKELRRLFLVDDLVDS
                     LKFAVLMWVFTYVGALFNGLTLLILALISLFSIPVIYERHQAQIDHYLGLANKSVKDA
                     MAKIQAKIPGLKRKAE"
     3'UTR           3174..4063
                     /gene="Rtn4"

ORIGIN 

Query Match 70.9%; Score 2651; DB 10; Length 4063;
Best Local Similarity 93.7%; Pred. No. 0;
Matches 2804; Conservative 0; Mismatches 160; Indels 27; Gaps 3;
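
The percentages in this summary appear consistent with two simple ratios: Query Match as the score divided by the perfect score (3741, from the report header), and Best Local Similarity as matches divided by the number of aligned columns (matches + mismatches + indels). The report itself does not state these definitions, so the sketch below is an assumption checked only against the figures shown here; the function name `summary_percentages` is hypothetical.

```python
def summary_percentages(score, perfect_score, matches, mismatches, indels):
    """Recompute the two summary percentages from the reported counts.

    Assumed definitions (not stated in the report):
      Query Match           = score / perfect score
      Best Local Similarity = matches / (matches + mismatches + indels)
    """
    query_match = 100.0 * score / perfect_score
    aligned_columns = matches + mismatches + indels
    local_similarity = 100.0 * matches / aligned_columns
    # Round to one decimal place, as the report prints them.
    return round(query_match, 1), round(local_similarity, 1)


# Figures for the AY102280 hit: score 2651, 2804 matches,
# 160 mismatches, 27 indels; perfect score 3741.
print(summary_percentages(2651, 3741, 2804, 160, 27))
```

Under these assumptions the values come out to 70.9 and 93.7, which agrees with the summary lines above.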

Qy 768 GGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGA 827 

I I I I I I I I I I 1 I I I I II I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I 
Db 188 GAATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGA 247 

Qy 828 AAAAAT TAT GG AT T T GAT G GAGC AG C C AGGT AACACT GT T T C GT CT GGT CAAGAGGAT TT 887 

I I I I I I I I I I I I I I I I I I I I I I I I I II M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 24 8 AAAAAT TAT GGAT T T GAAGGAGC AGC C AGGT AAC AC T GT T T C GT CT GGT CAAGAGGAT T T 307 

Qy 888 CCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGT 947 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I M I I I I I I I I I 
Db 308 CCCATCTGTCCTGTTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGT 367 

Qy 948 T T CT T T T AAAGAAC AT GGAT AC C T T GGT AACT T AT CAG CAGT GT CAT CCT C AGAAG GAAC 1007 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I M I I I I I I I I 
Db 368 T T CT T T T AAAGAAC AC GGAT AC CT T G GT AAC T TAT CAG CAGT GGC AT CCAC AGAAG GAAC 427 

Qy 1008 AATT GAAGAAACTTTAAAT GAAGCTT CTAAAGAGTT GCCAGAGAGGGCAACAAAT CCATT 1067 

I I I I I I I I I II I I I I I I I I I I I I I I I I I III I I I I II I I I I II I I I I I I I I I I I I I I 

Db 428 T ATT GAAGAAACT T T AAAT GAAGCT T C T AGAGAAT T GC CAGAGAGGG C AACAAAT C CAT T 487 

Qy 1068 T GTAAAT AGAGAT T T AGC AGAAT T T T C AGAAT T AGAAT AT T C AGAAAT G GGAT CAT CT T T 1127 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 488 T GTAAAT AGAGAGT CAG C AGAGT T T T CAGT AT T AGAAT ACT CAGAAAT G GGAT CAT CT T T 547 

Qy 112 8 T AAAGG CT C C C CAAAAGGAGAGT CAGC CAT AT T AGTAGAAAAC ACTAAGGAAGAAGT AAT 1187 

II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 54 8 CAAT GG CT C C C CAAAAG GAGAGT CAGC CAT GTT AGTAGAAAAC ACTAAGGAAGAAGT AAT 607 

Qy 118 8 T GT GAGGAGT AAAGACAAAGAGGAT T T AGT T T GT AGT GCAGC C CT T CACAGT C CACAAGA 1247 

I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 608 T GT G AG GAGT AAAGACAAAGAGGAT T T AGT T T GT AGT GCAGC C CT T CAT AAT C CACAAGA 667 

Qy 124 8 ATCACCT GT GGGT AAAGAAGACAGAGT T GT GT CT C C AGAAAAGAC 12 92 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 

Db 668 GT CAC CT GC GAC CCT T ACTAAAGT GGT TAAAGAAGACGGAGT TAT GT CT C C AGAAAAGAC 727 

Qy 12 93 AATGGACAT T T T T AAT GAAAT GC AGAT GT CAGT AGT AG CAC CT GT GAGGGAAGAGTAT GC 1352 

I I I I II I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I II II I I I I I M I I I I I I 
Db 728 AAT GG ACAT T T TT AAT GAAAT GAAAAT GT CAGT GGTAGCAC CT GT GAGGGAAGAGTAT GC 7 87 

Qy 1353 AGAC T T T AAG C CAT T T GAACAAG CAT G GGAAGT GAAAGAT AC T TAT GAG GGAAGT AGG GA 1412 

III I I I I I I I I I I I I M I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 788 AGAT T T TAAG C CAT T T GAACAAG CAT GGGAAGT GAAAGAT AC T TAT GAG GGAAGT AGG GA 847 



Qy       1413 TGTGCTGGCTGCTAGAGCTAATGTGGAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAG 1472
              |||||||||||||||||||||| |||||||||||||||||| ||||||||| ||||||||
Db        848 TGTGCTGGCTGCTAGAGCTAATATGGAAAGTAAAGTGGACAAAAAATGCTTTGAAGATAG 907

Qy 1473 CCT G GAG CAAAAAAGT CT T G GGAAGGAT AGT GAAGGC AGAAAT GAGGAT GCTTCTTTCCC 1532 

I I I I I I I II I I I I III I I I II I I I I I I I II I I I I I I I I I I I II I I I I I I I I I II I I 
Db 908 CCT GGAGC AAAAAGGT C AT GGGAAG GAT AGT GAAAGCAGAAAT GAGAAT GCTTCTTTCCC 967 

Qy 1533 CAGT AC C C C AGAAC CT GT GAAG GAC AG CT C C AGAGCAT AT AT T AC CTGTGCTTCCTT T AC 1592 

III I I I I I I I I I I I II I I I I I I I I I II I I I I I I II II I I I I I I I I I I I I I I I 
Db 968 C AGGAC C C C AGAACT T GT GAAGGAC G GCT C C AGAG C GT AC AT C AC CT GT GAT T C CT T TAG 1027 

Qy 1593 CT C AG CAAC C GAAAGC AC C AC AGCAAAC AC TTTCCCTTT GT T AGAAGAT C AT ACTT C AG A 1652 

I I I I I I I I I I I I II II I I I I I II II I II I I II II II I I I I I I I I I I II I I I I 
Db 102 8 CT C AG C AAC CGAGAGT AC T GCAG CAAAC AT TTTCCCTGTGC T AGAAGAT C AC ACTT C AGA 1087 

Qy 1653 AAAT AAAAC AGAT GAAAAAAAAAT AGAAGAAAGGAAGGC C CAAAT T ATAACAGAGAAGAC 1712 

III I II I I I I I I II I I I I I I I I I I I I I II I I II I M I M I I I I I I I I I I I I I I I I I I I I 
Db 108 8 AAAC AAAAC AG AT GAAAAAAAAAT AGAAGAAAGGAAG G C C CAAAT T AT AAC AGAGAAGAC 1147 

Qy 1713 T AGC C C CAAAAC GT CAAAT CCTTTCCTT GT AGC AGT AC AGGAT T CT GAGG CAGATTAT GT 1772 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II MM I I I I II I I I I I I I I II I I I I 
Db 114 8 T AGC C C CAAAAC GT CAAAT C C T T T C CT T GT AGCAATAC AT GAT T CT GAG GC AGATT AT GT 12 07 

Qy 1773 T AC AAC AGAT AC CT T AT CAAAGGT GACT GAG GCAG CAGT GT CAAAC AT GC CT GAAG GT CT 18 32 

I I I I I I I I I I I I I I I I I I M II I I I I I I I I I I I I II III I I I I I I I I I I I II II 
Db 12 08 CACAACAGATAATTTATCAAAGGTGACTGAGGCAGTAGTGGCAACCATGCCTGAAGGTCT 1267 

Qy 1833 GACGC C AG AT T T AGT T CAGGAAGCAT GT GAAAGT GAAC T GAAT GAAGC C AC AGGT ACAAA 18 92 

I I I I I M I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 12 68 AAC G C C AGAT T TAGT T CAGGAAGCAT GT GAAAGT GAACT GAAC GAAGC CACAG GT ACAAA 1327 

Qy 18 93 GATT G CT TAT GAAAC AAAAGT GGAC T T GGT C CAAACAT CAGAAGC T AT ACAAGAAT C AC T 1952 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I II I I I III I 
Db 132 8 GATTGCTTATGAAACAAAAGTGGACTT GGTCCAGACAT CAGAAGCTATACAAGAGTCAAT 13 87 

Qy 1953 T T AC C C CACAG CACAG C T T T GC C CAT CAT TT GAGGAAGCT GAAGCAACT C C GT C AC CAGT 2 012 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 138 8 T T AC C C CACAG C ACAGCT T T GC C CAT CAT TT GAGGAAGCT GAAGCAACT C C GT CAC CAGT 14 47 

Qy 2013 TTTGCCTGATATTGTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTC 2072 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 1448 TTTGCCTGATATTGTTATGGAAGCGCCATTAAATTCTCTCCTTCCAAGCACTGGTGCTTC 1507 

Qy 207 3 T GT AGT GC AGC CC AGT GT AT C C C CAC T GGAAG CAC CT C CT C CAGT TAGT TAT GAC AGT AT 2132 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I II I I 
Db 150 8 TGTAGCGCAGCCCAGTGCATCCCCACTAGAAGTACCGTCTCCAGTTAGTTATGACGGTAT 1567 

Qy 2133 AAAGCT T GAGC CT GAAAAC C CC C CAC CAT AT GAAGAAGC C AT GAAT GT AGC ACTAAAAGC 2192 

I I I I I II I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 1568 AAAGCT T GAG C CT GAAAAT C CC C CAC CAT AT GAAGAAG CC AT GAGT GT AG C ACT AAAAAC 1627 

Qy 2193 T T T G G GAACAAAGGAAGGAAT AAAAGAGC CT GAAAGT T TT AAT G C AGCT GT T CAGGAAAC 2252 

I M I I II I I I I I III I I II I I I I I I II I I I I I I I II I II I I I I I I I I I I I I 
Db 1628 AT CGGACT CAAAGGAAGAAATTAAAGAGC CT GAAAGTTTTAAT GCAGCT GCT CAGGAAGC 1687 

Qy 2253 AGAAGCT C CTTATATAT C C ATT GCGT GT GATT TAATT AAAGAAACAAAGCT CT C C ACT GA 2312 

I I I I I I I I I I I I I I I I I II I I I I I M I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I 
Db 1688 AGAAG CT C CT T AT AT AT C CAT T GCAT GT GAT T TAAT TAAAGAAACAAAG CT C T C C ACT GA 1747 



Qy 2313 G C C AAG T C C AG AT T T C T C T AAT TAT T C AG AAAT AG C AAAAT T C G AG AAG TCGGTGCCC G A 2372 

I I I I I I I I I I I I I I I I I I I II I I I I I II I I I I II I I I I I I I I I I I I I I I I I M I I II 
Db 1748 GC CAAGT C CAGAGT T CT C T AAT TAT T CAGAAATAGC AAAAT T T GAGAAGT C GGT GC CT GA 1807 

Qy 2 37 3 ACAC GCT GAGCT AGT G GAG GAT T C C T C AC CT GAAT CT GAAC C AGT T G ACT T ATT T AGT GA 24 32 

III I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II 
Db 1808 T CAC T GT GAGCT C GT G GAT GAT T C CT CAC C C GAAT CT GAAC C AGT T GAC T T ATT T AGT GA 18 67 

Qy 2 433 T GAT T C GAT T C CT GAAGT C C C ACAAAC ACAAGAGGAG G CT GT GAT GCT CAT GAAG GAGAG 2492 

I I I I I I I I i I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I II II I I I I 
Db 1868 T GAT T CAAT T C CT GAAGT C C CACAAACACAAGAG GAGGC T GT GAT GCT AAT GAAGGAG AG 1927 

Qy 24 93 T CT C ACT GAAGT GTCTGAGAC AGT AGCCCAGCACAAA GAG GAGAGACT T AGT G C C T C 254 9 

I I I I I II I I I I I I I I II I I I I I I I I I II I I I I I I I I I M I I I I I I I I I I I II 

Db 192 8 T CT C ACT GAAGT GT CT GAGAC AGT AAC ACAACACAAAC AT AAG GAGAGACT T AGT GC T T C 1987 

Qy 2 550 AC CT CAG GAGCT AGGAAAGC CAT AT T T AGAGT C T T T T C AGC C CAAT T T AC AT AGT AC AAA 260 9 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 11 I I I I I I I I I I I I I I I I I II I 
Db 1988 AC CT CAG GAGGT AGGAAAGC CAT AT TT AGAGT CT T T T CAG C C CAATT T AC AT AT T AC AAA 2047 

Qy 2 610 AGAT GC T G CAT C T AAT GACAT T C CAAC AT T GAC CAAAAAG GAGAAAAT T T C T T T GCAAAT 2669 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I 
Db 2 04 8 AGATGCTGCATCTAATGAAATTCCAACATTGACCAAAAAGGAGACAATTTCTTTGCAAAT 2107 

Qy 2 67 0 GGAAGAGT T T AAT AC T GCAAT T TAT T CAAAT GAT GACT T ACT T T C TT C T AAGGAAGACAA 272 9 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I M I I I I I I I I I I I I I I I I I II I M I I I I 
Db 2108 GGAAGAGT T T AAT AC T GCAAT T TAT T C CAAT GAT GACT T ACT T T CTT C T AAGGAAGACAA 2167 

Qy 2730 AATAAAAGAAAGT GAAACATTT T CAGATT CAT CT CCGATT GAGAT AAT AGAT GAATTT C C 278 9 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 
Db 2168 AAT GAAAGAAAGT GAAACATTTT CC GATT CAT CT CCCATT GAGAT AAT AGAT GAGTTT CC 2227 

Qy 2790 CACGTTT GT CAGT GCTAAAGAT GATT CTCCTAAATT AGC CAAGGAGT ACACT GAT CT AGA 2849 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I II I I I I I I I I I I I I I 

Db 2228 CAC AT T T GT CAGT GCTAAAGAT GAT T CT C CT AAGGAGT ACACT GAC CT AGA 2278 

Qy 2 850 AGTATCCGACAAAAGTGAAATTGCTAATATCCAAAGCGGGGCAGATTCATTGCCTTGCTT 2909 

I II I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 2279 AGTATCCAACAAAAGTGAAATTGCTAATGTCCAGAGCGGGGCCAATTCGTTGCCTTGCTC 2338 

Qy 2 910 AGAAT T GC C CT GT GAC CT TT CT T T C AAGAAT AT AT AT C CT AAAGAT GAAGT AC AT GTT T C 2969 

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II 
Db 2 339 AGAAT T GC C CT GT GAC CT TT CT T T CAAGAAT ACAT AT C CTAAAGAT GAAGC AC AT GT CT C 2398 

Qy 2 97 0 AGAT GAAT T CT C C GAAAAT AGGT C C AGT GT AT CT AAGGCAT C CAT AT C GC CT T CAAAT GT 3029 

I I I I I i I I M I I I III I M I I I I I I I I M I I I II I I II III II I I I I I I I I 
Db 2399 AGATGAATTCTCCAAAAGTAGGTCCAGTGTATCTAAGGTGCCCTTATTGCTTCCAAATGT 24 58 

Qy 3030 CTCTGCTTT GGAAC C T C AGACAGAAAT GGGC AGCAT AGT T AAAT C CAAAT C ACT T AC GAA 3089 

II I I I I I I II I I I I I I I II I I I II I I I I I I I I I II I I I I I I I I I I I I I II II 
Db 2459 TTCTGCTTT GGAAT C T CAAAT AGAAAT GGGCAAC ATAGT T AAAC C CAAAGT ACT T AC GAA 2518 

Qy 3090 AGAAG CAGAGAAAAAACT T C CT T CT GAC ACAGAGAAAGAG GACAGAT C C C T GT C AGCT GT 314 9 

I I M I II I II I I I I I I I I I II I I II I I I I I II I I I I I II I II I I I I I I I I I I I II I I 
Db 2519 AGAAG CAGAGGAAAAACT T C C T T CT GAT ACAGAGAAAGAG GACAGAT C C C T GAC AGCT GT 2578 



Qy 3150 AT T GT C AG C AGAGC T GAGT AAAAC T T C AGT T GT T GAC CT C CT CT AC T G GAG AGACAT T AA 320 9 

I I I I I I II I I I I I I I I I t I II I I I I I I f II I I I I I I I I I I I I I II I I I I I II I I I I I I 
Db 2579 AT T GT CAGCAGAGCT GAAT AAAACT T C AGT T GT T GAC CT C CT GT ACT GGAGAGAC ATT AA 2638 

Qy 3210 GAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAG 3269 

I II II II II I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I 
Db 2 63 9 GAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAG 2698 

Qy 32 70 CATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAG 332 9 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I 
Db 2699 CATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCTGTGACTATCAGCTTTAG 2758 

Qy 3330 GAT AT AT AAGGG CGT GAT C CAGGCT AT C C AGAAAT CAGAT GAAG GC C AC C CAT T CAGGGC 338 9 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II M I II I I I I II I I I I I I I I I I I I 
Db 27 5 9 GAT AT ATAAGG GT GT GAT C C AAG C TAT C CAGAAAT CAGAT GAAGG C CAC C CAT T CAGGGC 2 818 

Qy 3390 AT AT T T AGAAT CT GAAGT T G CT AT AT C AGAG GAAT T G GT T CAGAAAT AC AG TAAT T C T G C 34 4 9 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2819 AT AT T T G GAAT C T GAAGT T GC C AT AT C AGAGGAAT T GGT T CAGAAAT AT AGT AAT T CT GC 287 8 

Qy 34 50 TCTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTT 3509 

I I I I I I I I I I I I I I I I I II I I I I II I I I I I I II I I I I I II I I I I M I I I I I I I I I I I 
Db 2879 T C T T G GT C AT GT GAAC AG C ACAATAAAAGAAT T GAGGC GT CT C TT CT T AGT T GAT GAT T T 2938 

Qy 3510 AGTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTT 3569 

M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I II I I I I I I I I I I I I I I I 
Db 2 939 AGTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTATTTACTTACGTTGGTGCCTTGTT 2998 

Qy 3570 C AAT G GT CT GAC ACT ACT GAT T T T AGC T CT GAT CT CAC T CT T C AGT AT T C C T GTT AT T T A 3629 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II 
Db 2999 CAAT G GT T T GAC ACT ACT GAT T T T AGC T CT GAT CT CACT CT T C AGT ATT C CT GT TAT AT A 3058 

Qy 3630 T GAAC G GC AT CAGGT GC AGAT AGAT CAT TAT CT AGGACT T G CAAACAAGAGT GT T AAG GA 3689 

I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 3059 TGAACGGCATCAGGCGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGCGTTAAGGA 3118 

Qy       3690 TGCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740
              |||||||||||||||||||||||||||||||||||||||||||||||||||
Db       3119 TGCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3169
This clone was selected for full length sequencing because it
passed the following selection criteria: Hexamer frequency ORF
analysis.

FEATURES             Location/Qualifiers
     source          1..3815
                     /organism="Mus musculus"
                     /mol_type="mRNA"
                     /db_xref="taxon:10090"
                     /clone="IMAGE:5366860"
                     /tissue_type="Eye, retina, mouse strain C57B1\6"
                     /clone_lib="NIH_MGC_94"
                     /lab_host="DH10B"
                     /note="Vector: pCMV-SPORT6"

ORIGIN 

Query Match 68.0%; Score 2543.6; DB 10; Length 3815; 

Best Local Similarity 93.5%; Pred. No. 0; 

Matches 2696; Conservative 0; Mismatches 159; Indels 27; Gaps 3; 

Qy 877 CAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCT 936 

I I I I I I I I I I I 11 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 10 CGAGAGGATTTCCCATCTGTCCTGTTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCT 69 

Qy 937 C T C T C AACT GT T T CT T T TAAAGAAC AT GGAT AC C T T G GT AACT T AT C AGC AGT GT CAT C C 996 

II I I I I I I I I I I I I I I I I II I I I I I I I I I I I M II II I II I I I I I I I I I II I I I I I I I 

Db 70 CTCTCAACTGTTTCTTTTAAAGAACACGGATACCTTGGTAACTTATCAGCAGTGGCATCC 12 9 

Qy 997 T CAGAAGGAACAAT T GAAGAAACT T T AAAT GAAG CT T CT AAAGAGT T GC C AGAGAGGGC A 1056 

II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II III I I I I II I I I I I I I I I 
Db 130 ACAGAAGGAACTATTGAAGAAACTTTAAATGAAGCTTCTAGAGAATTGCCAGAGAGGGCA 18 9 

Qy 1057 ACAAAT C CATTT GTAAAT AGAGATTT AGCAGAATTTT CAGAATTAGAATATT CAGAAAT G 1116 

I I II II I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I M I I 
Db 190 ACAAAT C CATTT GTAAAT AGAGAGT CAGCAGAGTTTT CAGT ATTAGAAT ACTCAGAAAT G 249 

Qy 1117 G GAT CAT CTT T TAAAG GCT C C C CAAAAGGAGAGT CAGC C AT AT T AGT AGAAAAC ACT AAG 117 6 

I I I I I I I I I I I II I I I I I I I I I I II I I I I I II t I I I I I I I I I I II I I I I I I I I I I M 
Db 250 GGAT CAT CT T T CAAT G GC T C C C CAAAAGGAGAGT CAGC CAT GT T AGT AGAAAAC ACT AAG 309 

Qy 1177 GAAGAAGTAAT T GT GAGGAGT AAAGACAAAGAGGAT T T AGT TT GT AGT GC AGC CCT T CAC 1236 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M II I I I I I I I I I I I I I I I I I I I I 
Db 310 GAAGAAGTAAT T GT GAG GAGT AAAGACAAAGAG GAT T T AGT TT GT AGT GC AGC CCT T CAT 369 

Qy 1237 AGT C CACAAGAAT C AC CT GTGGGTAAAGAAGACAGAGTTGTGTCT 1281 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I Mill 

Db 37 0 AAT C CACAAGAGT C AC CT GC GAC C CT T AC TAAAGT GGT T AAAGAAG AC GGAGTT AT GT CT 42 9 

Qy 1282 CCAGAAAAGACAATGGACATTTTTAATGAAATGCAGATGTCAGTAGTAGCACCTGTGAGG 1341 

I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I i I I I I I I I I I I I I I I I I I 
Db 430 CCAGAAAAGACAAT GGACATTT TTAAT GAAAT GAAAAT GT CAGT GGT AGCACCT GT GAGG 4 89 

Qy 1342 GAAGAGT AT GC AGACTT T AAGC CAT T T GAACAAG CAT GG GAAGT GAAAGATAC T TAT GAG 14 01 

I I I I I I I I I 1 I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I 
Db 490 GAAGAGT AT GCAGAT TT T AAGC CAT TT GAACAAGCAT G G GAAGT GAAAGATAC T TAT GAG 54 9 

Qy 1402 GGAAGTAGGGAT GT GCT GGCT GCT AGAGCTAAT GT GGAAAGTAAAGT GGACAGAAAAT GC 14 61 

I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 550 GGAAGTAGGGAT GTGCTGGCT GCT AGAGC T AAT AT G GAAAGTAAAGT GGACAAAAAAT GC 609 

Qy 1462 T T GGAAGAT AGC CT G GAG C AAAAAAGT CTT GGGAAG GAT AGT GAAG GCAGAAAT GAGGAT 1521 

II I I I I I I I I I I I I I I I I I I II I III I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 610 T T T GAAGAT AGC CT GGAG C AAAAAG GT CAT GGGAAGGAT AGT GAAAGC AGAAAT GAGAAT 669 



Qy       1522 GCTTCTTTCCCCAGTACCCCAGAACCTGTGAAGGACAGCTCCAGAGCATATATTACCTGT 1581
              |||||||||||||| |||||||||| |||||||||| |||||||||| || || ||||||
Db        670 GCTTCTTTCCCCAGGACCCCAGAACTTGTGAAGGACGGCTCCAGAGCGTACATCACCTGT 729

Qy 1582 GCTTCCTT T AC CT CAG C AAC C GAAAGC AC C AC AG CAAACACT T T C C CT TT GT T AGAAGAT 1641 

I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I II I I I I I I I II I I I I I I I I 

Db 730 GAT T C C T T T AGC T C AGCAAC C GAGAGT ACT G CAG CAAAC AT TTTCCCTGT GCT AGAAGAT 78 9 

Qy 1642 CATACTT CAGAAAATAAAACAGAT GAAAAAAAAATAGAAGAAAGGAAGGCC CAAATT AT A 1701 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II 

Db 790 CACACT T CAGAAAACAAAACAGAT GAAAAAAAAATAGAAGAAAGGAAGGCCCAAATTATA 84 9 

Qy 1702 AC AGAGAAGACT AGC C C CAAAAC GT CAAAT CCTTTCCTT GT AGC AGT ACAG GAT T CT GAG 1761 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 1 I I I 
Db 850 AC AGAGAAGAC T AGC C C CAAAAC GT CAAAT CCTTTCCTT GT AG CAAT ACAT GAT TCT GAG 909 

Qy 1762 GC AGAT TAT GT T ACAAC AGAT AC C T TAT C AAAG GT GACT GAGGC AGC AGT GT CAAACAT G 1821 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I III I I I I 
Db 910 GC AG AT TAT GT CACAACAGAT AAT TT AT CAAAG GT GACT GAGG C AGT AGT G G CAAC CAT G 969 

Qy 1822 CCT GAAG GT CT GAC GC C AGAT T T AGT T CAG GAAG CAT GT GAAAGT GAAC T GAAT GAAGC C 18 81 

I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 970 CCT GAAGGT CT AAC GC C AGAT TT AGT T C AGGAAGCAT GT GAAAGT GAACT GAAC GAAGC C 1029 

Qy 1882 ACAG GT ACAAAGAT T G CTT AT GAAACAAAAGT G GAC TT GGT C CAAACAT CAGAAGCT AT A 1941 

I I I I I I I I I I I I I I I I I I I I I I I I II I M I I I I I I I M I I II I I I I I I I I I I I I I I I I I 
Db 103 0 ACAGGTACAAAGATT GCTT AT GAAACAAAAGT GGACTTGGT CCAGACATCAGAAGCTATA 108 9 

Qy 1942 CAAGAAT C ACT T T AC C C CAC AGCAC AGCT T T GC C CAT CAT T T GAG GAAGCT GAAGCAACT 2 001 

I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I 
Db 1090 C AAGAGT CAAT TT AC C C C AC AGCACAG CT T T GC C CAT C AT TT GAGGAAGC T GAAGCAAC T 114 9 

Qy 2 002 C C GT CAC CAGT TT T GC CT GAT AT T GT TAT G GAAG CAC CAT T AAAT T CT CT C CT T CCAAGC 2061 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I II I I I I I M 
Db 1150 CCGTCACCAGTTTTGCCTGATATTGTTATGGAAGCGCCATTAAATTCTCTCCTTCCAAGC 12 09 

Qy 2062 GCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTGGAAGCACCTCCTCCAGTTAGT 2121 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I III I I I I I II I I I I 
Db 1210 ACT GGT GCTT CTGTAGCGCAGCCCAGTGCATCCCCACTAGAAGTACCGTCTCCAGTTAGT 12 69 

Qy 2122 TAT GAC AGT AT AAAG CTT GAGC CT GAAAAC C C C C CAC CAT AT GAAGAAGC CAT GAAT GT A 2181 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I MM 
Db 1270 TAT GAC GGT AT AAAG CTT GAGC CT GAAAAT C C C C CAC CAT AT GAAGAAGC CAT G AGTGT A 1329 

Qy 2182 GCACTAAAAGCTTTGGGAACAAAGGAAGGAATAAAAGAGCCTGAAAGTTTTAATGCAGCT 2241 

I I I I I II I I I I II I II II I II I III I I I II II II I I I II II I M I I II II II 
Db 1330 G CACT AAAAAC AT C GGACT CAAAG GAAGAAAT T AAAGAG CCT GAAAGTT T T AAT GCAGCT 138 9 

Qy 2242 GT T CAG GAAAC AGAAGCT C CT T AT AT AT C CAT T GC GT GT GAT T T AAT TAAAGAAACAAAG 2301 

I II I I I I I II I I I II II I I II II I I II II II II I I I II I II I II II II I I I I I II I I 
Db 1390 GCT CAG GAAGC AGAAGCT C CTT AT AT AT C CAT T GCAT GT GAT T T AAT TAAAGAAACAAAG 1449 

Qy 2302 CT CT C CAC T GAGC CAAGT C C AGAT T T C T CT AAT TAT T C AGAAAT AGCAAAAT T C GAGAAG 2361 

I M I I I II I II II I I I || M I II I I I II II I I II II I I I I I II II I II II I I I I II I I 
Db 14 50 CT CT C CAC T GAGC CAAGT C C AGAGT TCT CT AAT TAT T C AGAAAT AGC AAAAT T T GAGAAG 1509 

Qy 2362 T C GGT GCCCGAACAC GCT GAGCTAGTGGAGGATT CCT CAC CT GAAT CT GAAC CAGTT GAC 2 421 

I M I II I I II III I II I I I II II I II I II II I I II II I II II I I M II I I I I I 



Db 1510 T C GGT GC CT GAT C ACT GT GAG C T C GT GGAT GAT T C CT C AC C C GAAT CT GAAC C AGT T GAC 1569 

Qy 2422 TT AT T T AGT GAT GAT T C GAT T C C T GAAGT C C C ACAAAC ACAAGAG GAG GC T GT GAT GCT C 2481 

I I I I I I I II II I I i I I I I I I I I I I I I I I I I I I I I I I M II I I I I I I I I I I | | | | I | I I 
Db 1570 T TAT T T AGT GAT GAT T C AAT T C C T GAAGT C C CAC AAAC ACAAGAGGAGG CT GT GAT GCT A 1629 

Qy 2482 AT GAAGGAGAGTCTC ACT GAAGT GTCTGAGAC AGT AGCCCAGCACAAA GAGGAGAGA 2538 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I II II I I I I I I I I I I I I 

Db 1630 AT GAAG GAGAGT C T CAC T GAAGT GT C T GAGAC AGTAAC ACAAC AC AAAC AT AAGGAGAGA 168 9 

Qy 2539 CTTAGTGCCTCACCTCAGGAGCTAGGAAAGCCATATTTAGAGTCTTTTCAGCCCAATTTA 2598 

I I I I I I I I I I II I I I I I II I I II I I I I I I I I I M I I I I I I II I I I I II I I I M I I I I I 
Db 1690 C T T AGT GCT T CAC C T C AG GAGGT AG GAAAGC C AT ATT T AGAGT CT T T T C AGC C CAAT T T A 174 9 

Qy 2599 CATAGTACAAAAGATGCTGCATCTAATGACATTCCAACATTGACCAAAAAGGAGAAAATT 2658 

I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I M I I I I I I I I I I I I II I 
Db 1750 CAT AT T AC AAAAG AT G C T G CAT C T AAT GAAAT T C C AAC AT T GAC C AAAAAG GAG AC AAT T 18 09 

Qy 2659 T CTT T G CAAAT GGAAGAGT T TAAT ACT GC AAT T TAT T CAAAT GAT GACT T ACTT T CTT CT 2718 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 1 I I M I I I I I I I I I I I I I I I I I I I 
Db 1810 T CTT T G CAAAT GGAAGAGT T TAAT ACT GCAAT T T ATT C CAAT GAT GACTT ACT T T CT T CT 1869 

Qy 2719 AAGGAAGACAAAATAAAAGAAAGT GAAACATTTT CAGATT CAT CT CC GATTGAGATAAT A 2778 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I II I 
Db 1870 AAGGAAGACAAAAT GAAAGAAAGT GAAACATTTT CCGATT CAT CT CCCATT GAGATAATA 1929 

Qy 2779 GAT GAAT T T C C CAC GT T T GT C AGT G CT AAAGAT GATT CT C CTAAATT AGC CAAG GAGT AC 2 838 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1930 GAT GAGTTT C C CAC AT T T GT CAGT G C T AAAGAT GAT T C T C C T AAGGAGTAC 1980 

Qy 2839 AC T GAT CTAGAAGTAT C C GAC AAAAGT GAAAT T GCT AAT AT C CAAAGCGG GGC AGATT C A 2 8 98 

I I I I I I I I I II I I I I I I I I I I I I I II I I I I I II II II I I I I I I I I I I I I I I I I 
Db 1981 ACT GAC CTAGAAGTAT C CAACAAAAGT GAAAT T GCT AAT GT C CAGAGCGGGGCCAATT C G 204 0 

Qy 2899 T T GC CT T GCT T AGAAT TGCCCTGT GAC CT TT C T T T CAAGAAT AT AT AT C CTAAAGAT GAA 2958 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2041 T T GC CT T GCT C AGAATT G C C CT GT GAC CT TT C T T T CAAGAAT AC AT AT C CTAAAGAT GAA 2100 

Qy 2959 GT AC AT GT T T CAGAT GAAT T CT C C GAAAAT AGGT C CAGT GTAT C T AAGGC AT C CAT AT C G 3018 

I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I.I I I I M I I II I I II Ml I 
Db 2101 G CAC AT GT CT CAGAT GAAT T CT C CAAAAGT AGGT C CAGT GTAT C T AAGGT GC C CT T AT T G 2160 

Qy 3019 C CT T CAAAT GTCTCTGCTTT GGAAC C T CAGACAGAAAT GGGC AGC AT AGT T AAAT C CAAA 3078 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I 
Db 2161 C T T C CAAAT GTTTCTGCTTT GGAAT C T CAAAT AGAAAT GG GCAAC AT AGT T AAAC C CAAA 2220 

Qy 3079 T CAC T T AC GAAAGAAGC AGAGAAAAAAC TTCCTTCT GAC ACAGAGAAAGAG GAC AGAT C C 3138 

I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2221 GTACT T ACGAAAGAAGCAGAGGAAAAACTTCCT T CT GATACAGAGAAAGAGGACAGAT CC 2280 

Qy 3139 CTGTCAGCTGTATTGTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTCTACTGG 3198 

III I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I II I I I I I I 
Db 2281 CTGACAGCTGTATTGTCAGCAGAGCTGAATAAAACTTCAGTTGTTGACCTCCTGTACTGG 2340 

Qy 3199 AGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTG 3258 

I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I 
Db 2341 AGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTG 24 00 



Qy 3259 ACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACT 3318 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I II I I 
Db 2401 ACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCTGTGACT 24 60 

Qy 3319 AT CAG C T T TAG GAT AT ATAAGG G C GT GAT C CAGG CT AT C CAGAAAT C AGAT GAAGGC C AC 3378 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II 
Db 24 61 AT CAG C T T TAG GAT AT AT AAG G GT GT GAT C C AAGCT AT C CAGAAAT C AGAT GAAGGC C AC 252 0 

Qy 3379 C CAT T CAG G GCAT AT T T AGAAT CT GAAGT T G CT ATAT C AGAG GAAT T GGT T CAGAAAT AC 3438 

I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2521 C CAT T CAG G GCAT AT T T GGAAT C T GAAGT T G C CAT AT C AGAG GAAT T GGT T CAGAAAT AT 2580 

Qy 3439 AGTAATTCTGCTCTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCTTA 3498 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I II I I I I I I 
Db 2581 AGTAAT TCTGCTCTT GGT CAT GT GAAC AGCACAAT AAAAGAAT T GAG G C GT CT C TT CT T A 264 0 

Qy 3499 GTT GAT GATTTAGTTGATTCCCTGAAGTTTGCAGTGTT GAT GTGGGTGTT TACT TAT GTT 3558 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I III 

Db 2 641 GTT GAT GAT T T AGT T GAT T C C C T GAAGT T T GC AGT GTT GAT GT G GGT AT T T ACT T AC GT T 27 00 

Qy 3559 GGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATT 3618 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I 

Db 2701 GGTGCCTTGTTCAATGGTTTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATT 2760 

Qy 3619 C C T GT TAT T TAT GAAC GG CAT C AGGT GCAGAT AGAT CATT AT C T AGGACT T G CAAAC AAG 3678 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I i I I I I 
Db 27 61 C C T GT TAT AT AT GAAC G G CAT CAG GC GCAGAT AGAT CATT AT CT AGGACT T G CAAAC AAG 2820 

Qy 3 679 AGT GT T AAGGAT G C CAT G G C CAAAAT C C AAG C AAAAAT C C C T GGAT T GAAGC GC AAAG C A 37 38 

II I M I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2 821 AG CGT T AAGGAT GC CAT G G C CAAAAT C CAAG C AAAAAT C C C T GGAT T GAAGC GCAAAG C A 28 80 

Qy 3739 GA 3740 

I I 

Db 2881 GA 2882 
AB040462 4166 bp mRNA linear PRI 10-OCT-2001 

Homo sapiens mRNA for RTN-xL, complete cds. 
AB040462.1 GI:11610574 
reticulon. 

Homo sapiens (human) 
AUTHORS Eguchi,Y., Tagami,S. and Tsujimoto,Y. 
TITLE Direct Submission 

JOURNAL Submitted (22-MAR-2000) Yutaka Eguchi, Osaka University Graduate 
School of Medicine, Biomedical Research Center, Department of 
Medical Genetics; Yamadaoka 2-2, Suita, Osaka 567-0871, Japan 
(E-mail:eguchi@gene.med.osaka-u.ac.jp, Tel:+81-6-6879-3363, 
Fax:+81-6-6879-3369) 
FEATURES Location/Qualifiers 
source 1..4166 
/organism="Homo sapiens" 
/mol_type="mRNA" 
/db_xref="taxon:9606" 
/tissue_type="brain" 
/tissue_lib="human fetal brain" 
gene 1..4166 
/gene="RTN-x" 
CDS 245..3823 
/gene="RTN-x" 
/note="reticulon family" 
/codon_start=1 
/product="RTN-xL" 
/protein_id="BAB18927.1" 
/db_xref="GI:11610575" 

/translation="MEDLDQSPLVSSSDSPPRPQPAFKYQFVREPEDEEEEEEEEEED 
EDEDLEELEVLERKPAAGLSAAPVPTAPAAGAPLMDFGNDFVPPAPRGPLPAAPPVAP 
ERQPSWDPSPVSSTVPAPSPLSAAAVSPSKLPEDDEPPARPPPPPPASVSPQAEPVWT 
PPAPAPAAPPSTPAAPKRRGSSGSVDETLFALPAASEPVIRSSAENMDLKEQPGNTIS 
AGQEDFPSVLLETAASLPSLSPLSAASFKEHEYLGNLSTVLPTEGTLQENVSEASKEV 
SEKAKTLLIDRDLTEFSELEYSEMGSSFSVSPKAESAVIVANPREEIIVKNKDEEEKL 
VSNNILHNQQELPTALTKLVKEDEVVSSEKAKDSFNEKRVAVEAPMREEYADFKPFER 
VWEVKDSKEDSDMLAAGGKIESNLESKVDKKCFADSLEQTNHEKDSESSNDDTSFPST 
PEGIKDRSGAYITCAPFNPAATESIATNIFPLLGDPTSENKTDEKKIEEKKAQIVTEK 
NTSTKTSNPFLVAAQDSETDYVTTDNLTKVTEEVVANMPEGLTPDLVQEACESELNEV 
TGTKIAYETKMDLVQTSEVMQESLYPAAQLCPSFEESEATPSPVLPDIVMEAPLNSAV 
PSAGASVIQPSSSPLEASSVNYESIKHEPENPPPYEEAMSVSLKKVSGIKEEIKEPEN 
INAALQETEAPYISIACDLIKETKLSAEPAPDFSDYSEMAKVEQPVPDHSELVEDSSP 
DSEPVDLFSDDSIPDVPQKQDETVMLVKESLTETSFESMIEYENKEKLSALPPEGGKP 
YLESFKLSLDNTKDTLLPDEVSTLSKKEKIPLQMEELSTAVYSNDDLFISKEAQIRET 
ETFSDSSPIEIIDEFPTLISSKTDSFSKLAREYTDLEVSHKSEIANAPDGAGSLPCTE 
LPHDLSLKNIQPKVEEKISFSDDFSKNGSATSKVLLLPPDVSALATQAEIESIVKPKV 
LVKEAEKKLPSDTEKEDRSPSAIFSAELSKTSVVDLLYWRDIKKTGVVFGASLFLLLS 
LTVFSIVSVTAYIALALLSVTISFRIYKGVIQAIQKSDEGHPFRAYLESEVAISEELV 
QKYSNSALGHVNCTIKELRRLFLVDDLVDSLKFAVLMWVFTYVGALFNGLTLLILALI 
SLFSVPVIYERHQAQIDHYLGLANKNVKDAMAKIQAKIPGLKRKAE" 

ORIGIN 



Query Match 63.9%; Score 2391.4; DB 9; Length 4166; 

Best Local Similarity 81.0%; Pred. No. 0; 

Matches 3100; Conservative 0; Mismatches 591; Indels 134; Gaps 22; 
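
The two percentages in this summary can be cross-checked from the counts printed beside them. The sketch below is a minimal illustration, not GenCore's documented method: it assumes that Query Match is the hit score divided by the perfect score of 3741 given in the run header, and that Best Local Similarity is the number of matches divided by the total alignment columns (matches + mismatches + indels). Both definitions are inferred from the reported numbers, and the function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-match) score.

    Assumed definition, inferred from the report's numbers.
    """
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all alignment columns.

    Assumed definition, inferred from the report's numbers.
    """
    columns = matches + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Values copied from the summary lines of this result.
    print(f"Query Match:           {query_match_pct(2391.4, 3741):.1f}%")
    print(f"Best Local Similarity: {best_local_similarity_pct(3100, 591, 134):.1f}%")
```

Under these assumptions the script reproduces the reported 63.9% and 81.0%, which suggests the summary lines are internally consistent.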

Qy 19 GGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCGATCGCGAAGGCAGCAGAA 78 

I M I I i I I I III I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 2 6 GGCGGCGGCGGCAAGTGGGGACAGGGCGGGTGGCGCATCACCGGCGCGGAGGCAGGAGGA 8 5 

Qy 7 9 GCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTTCGGCTCGGCTCGGCACGA 138 

M I I I I I I I II I I I I I | | | | | | | | | | || | | | | | | | | | | | 
Db 86 GCAGTCTCATTGTTCCGGGAGCCGTCACCACAGTAGGTCCCTCGG 130 



Qy 139 CTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACTCTGAG 198 

III I I III II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 131 CTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCTCTGAG 190 

Qy 199 GAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCCATGGA 257 

I II I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I II I I I I 
Db 191 ACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCCATGGA 24 9 

Qy 258 AGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCTCCGCC 317 

I I II I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I M I I I I I I I III 
Db 250 AGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCGCAGCC 30 6 

Qy 318 CGCCTTCAAGTAC CAGTT CGT GACGGAGCCC GAGGACGAGGAGGACGAGGAGGAGGAGGA 377 

III I I I I I I I I I I I I I I I I I I I I I II I I I 1 I I I I I I I I I I I I! II I I I I I I I I 
Db 307 C G CGT T CAAGT AC C AGT T C GT GAG G GAGC C C GAGGAC GAG GAG GAAGAAGAGGAGGA 363 

Qy 378 GGAC GAG GAG GAGGAC GAC GAGGAC CT AGAG GAAC T GGAGGT G C T GGAGAGGAAGC C C GC 437 

III I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I 
Db 364 G GAAGAGGAGGAC GAG GAC GAAGAC CT GGAG GAGCT G GAG GT GCT G GAGAGGAAGC C C GC 423 

Qy 438 AGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTGCTGGA 491 

I I I I I I I I I I I I I I II I I I I I I II I I I 1 I I I I I I I I I III I I I I 

Db 424 CGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTGATGGA 483 

Qy 492 CTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGC 551 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 4 84 CTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCCCCCGT 543 

Qy 552 CGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCCGCGCC 602 

I I I I I I I I I I II I I I I I I I I i I I I I I I I I I I I I I II I I I I 

Db 54 4 CGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCCGCGCC 603 

Qy 603 ATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCC 662 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I 
Db 604 ATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAGCCTCC 663 

Qy 663 GGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCG 715 

III I I I I II I I I I I II II III III I I I I I I I I I II I I I I I I I 
Db 664 GGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTGTGGAC 723 

Qy 716 CCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGG 755 

I I I I II II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 724 CCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGCAGGGG 7 83 

Qy 756 CTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCTGTGAT 812 

I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I II I I I I I I I I I I I I 

Db 7 84 CTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCTGTGAT 84 3 

Qy 813 AC CCTCCTCT GC AGAAAAAAT TAT G GAT T T GAT G G AGC AGC C AGGTAAC ACT GT T T CGT C 872 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 844 ACGCTCCTCTGCAGAAAA TATGGACTTGAAGGAGCAGCCAGGTAACACTATTTCGGC 900 

Qy 873 TGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCTCTATC 932 

I I I I I I 1 I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I II 
Db 901 TGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCTCTGTC 960 



Qy 933 TCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCAGTGTC 992 

1 I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II III I II I I 
Db 961 T C CT CT CT CAG CCGCTTCTTT C AAAGAACAT GAAT AC CT T GGT AAT T T GT CAACAGT ATT 1020 

Qy 993 AT C CT C AGAAGGAACAAT T GAAGAAAC T T T AAAT GAAG CT T C T AAAGAGT T GC C AGAGAG 1052 

I II I I I I I I II I I II II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 1021 ACC CACTGAAGGAAC ACTT CAAGAAAAT GT CAGT GAAGCTT CTAAAGAGGT CT CAGAGAA 1080 

Qy 1053 GGCAACAAAT C CATT T GTAAAT AGAGATTTAGCAGAAT TTT CAGAATTAGAAT ATTCAGA 1112 

I I I I I II II I I II I I I I I I I I I I I I I I I I I I I I M I I II I I II I I I I I I I 
Db 1081 GGCAAAAACT CT ACT CAT AGATAGAGAT TTAACAGAGTTTT CAGAATTAGAAT ACT C AGA 1140 

Qy 1113 AAT GGGAT CAT CT TTTAAAGGCT CCCCAAAAGGAGAGT CAGC CATATTAGTAGAAAACAC 1172 

I I I I I I II I I I I II I I II I I I I I I II III II III II I I I II I I I I I 

Db 1141 AAT GGGAT CAT C GTT C AGT GT C T C T CCAAAAGC AGAAT CT G C C GTAAT AGT AGCAAAT C C 1200 

Qy 1173 TAAGGAAGAAGT AATT GT GAGGAGT AAA GACAAAGAGGATTTAGTTT GT AGT GCAGC 1229 

II I I I I I I I I I I I I I I I Mill II I I I I I I I I I I I I I I I I 

Db 1201 T AG GGAAGAAAT AAT C GT GAAAAATAAAGAT GAAGAAGAGAAGT T AGT T AGT AATAAC AT 1260 

Qy 1230 C CT T C ACAGT C C AC AAGAAT CAC C T GT G G GT AAAG AAG AC AGAGT 1274 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I | M 

Db 1261 C C T T CAT AAT CAACAAGAGTT AC C T AC AG CT CT TACTAAAT T GGT TAAAGAGGAT GAAGT 1320 

Qy 1275 T GT GT CT C C AGAAAAGACAAT GGAC ATT T T TAAT GAAAT GCAGAT GT CAGT AGT AGC AC C 1334 

I I I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1321 T GT GT CT T C AGAAAAAGCAAAAGACAGT T T TAAT GAAAAGAGAGTT GCAGT G GAAGCT C C 1380 

Qy 1335 T GT GAGGGAAGAGT AT G CAG ACT T TAAG C CAT T T GAACAAGC AT GGGAAGT GAAAGAT AC 1394 

I II I I I I I II I I I I I II II I I II I I I I I I I I I II I I I I I I I I I I II I I I I I 
Db 1381 TAT GAGG GAGGAAT AT G C AGACT T CAAAC CAT T T GAGC GAGT AT GG GAAGT GAAAGAT A- 1439 

Qy 1395 T TAT GAGG GAAGT AGGGATGTGCTGGCTGCTAGAGCT AAT GT GGAAAG 1442 

I I I I I I I I I I I I I I M I I II I I I I I II I I I I I I I 

Db 1440 --GTAAGGAAGATAGTGATATGTTGGCTGCTGGAGGTAAAATCGAGAGCAACTTGGAAAG 1497 

Qy 1443 TAAAGT GGACAGAAAAT GC T T G GAAGAT AGC CT G GAG CAAAAAAGT C T T G GGAAG GAT AG 1502 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I II I I II I 

Db 14 98 TAAAGT GGATAAAAAAT GTTTT GCAGAT AGCCTT GAGCAAACTAAT CACGAAAAAGAT AG 1557 

Qy 1503 T GAAGG CAGAAAT GAG GAT GCT TCTTTCCC CAGT AC C C CAGAAC CT GT GAAG GACAGCT C 1562 

III I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I III 

Db 1558 T GAGAGT AGT AAT GAT GAT ACTT CT T T CC C C AGT AC GC C AGAAGGT AT AAAG GAT C GT T C 1617 

Qy 1563 C AGAGC ATAT AT T AC CTGTGCTTCCTT T A C CTCAGCAACCGAAAGCAC CACAGCAAA 1619 

I I I I I I I I I I I I I I I II I I I II II I I I I I I I I I I I I I I I I I I I I 
Db 1618 AGGAGC AT AT AT CAC AT GT G CT C C CT TT AAC C C AGC AGCAACT GAGAG CAT T G CAACAAA 1677 

Qy 162 0 CACTTT C CCT T T GT TAGAAGAT CAT ACTT CAGAAAATAAAACAGAT GAAAAAAAAAT AGA 1679 

II III I I I I I I I I I I I I I I I I I I I II II I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 1678 CATTTTT CCT TT GT TAGGAGAT CCT ACTT CAGAAAATAAGACCGAT GAAAAAAAAAT AGA 1737 

Qy 1680 AGAAAGGAAG GC C CAAAT T AT AAC AGAGAAG AC TAG C C C C AAAAC GT C AAAT C C T T T 1736 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Mill I I I I I 
Db 1738 AGAAAAGAAG G C C CAAAT AGT AAC AGAGAAGAAT ACT AGCAC CAAAAC AT CAAAC C CT T T 1797 

Qy 1737 C CT T GT AGCAGT AC AGGAT T CT GAG GCAGAT TAT GT T ACAAC AGAT AC CT T AT CAAAGGT 1796 



Db 1798 TCTTGTAGCAGCACAGGATTCTGAGACAGATTATGTCACAACAGATAATTTAACAAAGGT 1857 

Qy 1797 GACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATTTAGTTCAGGAAGC 1856 

I I I I I I I I II III I I I I I I I I I II II I I I I I I I I II I I I I I I I I I MINIM 
Db 1858 GACTGAGGAAGTCGTGGCAAACATGCCTGAAGGCCTGACTCCAGATTTAGTACAGGAAGC 1917 

Qy 1857 ATGTGAAAGTGAACTGAATGAAGCCACAGGTACAAAGATTGCTTATGAAACAAAAGTGGA 1916 

II I I I I I I II I I I I I I I II II I II I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 
Db 1918 ATGTGAAAGTGAATTGAATGAAGTTACTGGTACAAAGATTGCTTATGAAACAAAAATGGA 1977 

Qy 1917 CTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAGCACAGCTTTGCCC 1976 

MINI I I I I I I M I I I I I III I I M I Mill II II I I I I I II M I I II I I I 

Db 1978 CTTGGTTCAAACATCAGAAGTTATGCAAGAGTCACTCTATCCTGCAGCACAGCTTTGCCC 2037 

Qy 1977 ATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATATTGTTATGGAAGC 2036 

I I I I I I I I I II I I I I I I Mill I I II M I I M I I I I I I I I I I I I I I I I I I M I 
Db 2038 ATCATTTGAAGAGTCAGAAGCTACTCCTTCACCAGTTTTGCCTGACATTGTTATGGAAGC 2097 

Qy 2037 ACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCC 2096 

I I I I I I I I I I II I I I I II I I I I M I I M I I I I I I I I I I I I I I I I I 

Db 2098 ACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCATCACC 2157 

Qy 2097 ACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCTGAAAACCCCCC 2156 

I I I I I I II I I I II I I I I I I I I I II I II I I I I II I I I I I I I M I II I I I 
Db 2158 ATTAGAAG CTT CTT CAGTTAATTAT GAAAGCATAAAACAT GAGCCT GAAAACC CCCC 2214 

Qy 2157 AC CAT AT GAAGAAGC C AT GAAT GT AG C ACT AAAAGCT T T GG GAACAAAGGAAGGAAT 2213 

I I I I I I I I I I I I I I I I I I I I I I I MM I I I I I I MM I I I I I II I III 
Db 2215 ACCATATGAAGAGGCCATGAGTGTATCACTAAAAAAAGTATCAGGAATAAAGGAAGAAAT 2274 

Qy 2214 AAAAGAGCCTGAAAGTTTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTTATATATCCAT 2273 

I I M I I I I I I M I I I I II I I I I I I I I I I I I I I I II I I I I I I M I I II I I I II M 
Db 2275 TAAAGAGCCTGAAAATATTAATGCAGCTCTTCAAGAAACAGAAGCTCCTTATATATCTAT 2334 

Qy 2274 TGCGTGTGATTTAATTAAAGAAACAAAGCTCTCCACTGAGCCAAGTCCAGATTTCTCTAA 2333 

Ml II I I I I I I I I I I I I I II I I I I II I I I II I I I I III III I II I I I I I I I 
Db 2335 TGCATGTGATTTAATTAAAGAAACAAAGCTTTCTGCTGAACCAGCTCCGGATTTCTCTGA 2394 

Qy 2334 TTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGCTAGTGGAGGA 2393 

I I I I I II II I II I II I II I I I I I I I II M I I I I I I II I I I I I I II I I 
Db 2395 TTATTCAGAAATGGCAAAAGTTGAACAGCCAGTGCCTGATCATTCTGAGCTAGTTGAAGA 2454 

Qy 2394 TTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTCCTGAAGTCCC 2453 

II I II II I 11 I I II M II I II I I I M I II M M I I I I I I I II I I II I I II I M M 

Db 2455 TTCCTCACCTGATTCTGAACCAGTTGACTTATTTAGTGATGATTCAATACCTGACGTTCC 2514 

Qy 2454 ACAAACACAAGAG GAG GCT GT GAT GCT CAT GAAG GAGAGT CT CACT G A AGTGTC 2507 

I I I I I I I I I I I I I I I I I II M I I I I II I I I I I I I I M II I II 
Db 2515 ACAAAAACAAGATGAAACTGTGATGCTTGTGAAAGAAAGTCTCACTGAGACTTCATTTGA 2574 

2508 T GAGACAGTAGCCCAGCACAAAGAGGAGAGACTTAGTGCCTCACCT CAGGAGCTAGGAAA 2567 

I III I Ml I I I I I I II I I I I I I III III I M I I I 
Db 2575 GTCAATGATAGAATATGAAAATAAGGAAAAACTCAGTGCTTTGCCACCTGAGGGAGGAAA 2634 

Qy 2568 GC CAT AT T T AGAGT C T T T T CAG C C CAAT T T ACAT AGT ACAAAAGAT G C TGCATCTAA 2624 

I II II II I I I I II II M II I II I M I I I I I I II II I I I I I I I I 



Db 2635 GC CAT AT T T GGAAT C T T T T AAG CT C AGTT T AGAT AACACAAAAGAT AC C CT GT T AC CT GA 2694 

Qy 2 625 TGACATT CCAACATT GACCAAAAAGGAGAAAATTT CTTT GCAAAT GGAAGAGTTTAATAC 2684 

III II I I I I I I I I I I I I I II II I I I i I I I I I I I I I I I I I I I I III I I I I I 
Db 2695 T G AAGT T T C AAC AT T G AGC AAAAAG G AGAAAAT T C CT T T G C AGAT G GAG GAG C T C AGT AC 2754 

Qy 268 5 TGCAAT TTATT CAAAT GAT GACTT ACTTT CTT CTAAGGAAGACAAAATAAAAGAAAGT GA 2744 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I III 
Db 2755 T GC AGT T T ATT CAAAT GAT GACT T AT T TAT T T CTAAGGAAGCACAGAT AAG AGAAACT GA 2814 

Qy 27 45 AACAT T T T C AG AT T CAT CT C C GAT T G AGAT AAT AGAT GAAT T T C C C AC GT T T GT C AGT GC 2804 

III I I I I I I I I I I I I I I I I I Mill II I I I I I I I I I I II II II I I I I I I 
Db 2815 AAC GT T T T CAGAT T CAT CT C CAAT T GAAAT TAT AGAT GAGT T C C C T AC AT T GAT C AGT T C 2874 

Qy 2805 T AAAG AT GAT T C T C C T AAAT TAG C C AAGGAGT AC ACT GAT CT AGAAGT AT C C GACAA 2 8 61 

I I I I I I I I I I I I I I I I I I I I I I I III II I I I I I I I I I I I I I I I I I I II I 
Db 2 875 TAAAACT GAT T CAT TT T C TAAAT T AGC C AGG GAAT AT AC T GAC CT AGAAGT AT C C CACAA 2934 

Qy 2 8 62 AAGTGAAATTGCTAATATCCAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTG 2921 

I I I I I I II I I I I I I I I II I I II I I I I I I I II I I II I I I I I I I I I I 

Db 2935 AAGTGAAATTGCTAATGCCCCGGATGGAGCTGGGTCATTGCCTTGCACAGAATTGCCCCA 2994 

Qy 2922 T GAC CTTTCTTT CAAGAAT AT AT AT C CT AAAGAT GAAG TACATGTTT CAGAT GA 2 975 

I I I I I I I I I I I I I I I II III I II MM I I I I I I I I M II I M I 

Db 2995 T G AC CT T T CT T T GAAGAACATACAAC C CAAAGT T GAAGAGAAAAT C AGT T T CT CAGAT GA 3054 

Qy 2 976 AT T C T C C GAAAAT AG GT C C AGT GT AT CTAAG GC AT C CAT AT C GC CT T CAAAT GT CT CT GC 3035 

I I I I I I I I I M I I I I I I I I II I I I I I I I I I I I I I I I I M I 

Db 3055 CTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTTTCTGC 3114 

Qy 3 036 T T T GGAAC CT C AGACAGAAAT GG G CAGC AT AGT TAAAT C CAAAT CACTT AC GAAAGAAG C 3095 

I I I I I I I I I I I I I I I I I M M I I I I I I I I II II III MINIMI 
Db 3115 TTT GGCCACT CAAGCAGAGATAGAGAGCAT AGT TAAACCCAAAGTT CTT GT GAAAGAAGC 3174 

Qy 3096 AGAGAAAAAACTT C CTT CT GACACAGAGAAAGAGGACAGAT CCCT GT CAGCT GT ATT GTC 3155 

II I II I I I I I I I I I I I I II I I II I II I I I I I II I I I I I I II III I I I I II 

Db 3175 T GAGAAAAAACTT C CTT CCGAT ACAGAAAAAGAGGACAGAT CACCAT CT GCTAT ATTTT C 3234 

Qy 3156 AG CAGAGCT GAGTAAAACT T CAGT T GT T G AC CT C CT CT ACT GGAGAGAC AT TAAGAAGAC 3215 

I I I I II II I I II I I I I I I I I I I I M I II II II I I I I I I I I I I I I I II II I I I I M I II I 
Db 3235 AG CAGAGC T GAGTAAAACT T CAGT T GT T GAC CT C CT GT ACT GGAGAGAC AT TAAGAAGAC 32 94 

Qy 3216 TGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGT 3275 

I I I I II I I I I M I I I I I I I II I II M I II II I I I I II I I I I I II I I I I I I I I I I I 
Db 32 95 TGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGCATTGT 3354 

Qy 3276 CAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATA 3335 

II I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I II II I I I I I II I II 
Db 3355 GAG C GT AACAGC C T ACAT TGCCTTGGCCCTGCTCTC T GT GAC CAT C AG CTT T AGGAT AT A 3414 

Qy 3336 TAAG GGC GT GAT C C AGG CT AT C C AGAAAT CAGAT GAAGG C CAC C CAT T C AGGG C ATAT T T 3395 

I I I I I I I I I I I I I I I I II I I II I II I I M I I I I I I I I I I I I I I II II I I I I I I I I I 
Db 3415 CAAG GGT GT GAT C CAAG CT AT C CAGAAAT CAGAT GAAGG C CAC C CAT T C AGGG CAT AT CT 3474 

Qy 3396 AGAAT CT GAAGTT G CT AT AT CAGAG GAAT T GGT T CAGAAAT ACAGT AAT TCTGCTCTTGG 3455 

I II II I I I I I I I I I I I I I I I I I I I I II II I I I II II I I II I I II II I I I I I I I II I 
Db 34 75 GGAAT CT GAAGT T G CT AT AT CT GAG GAGT T GGT T C AGAAGT ACAGT AAT TCTGCTCTTGG 3534 



Qy 3456 T CAT GT GAAC AGC ACAAT AAAAGAACT GAGGC G GC TT T T CT T AGT T GAT GAT T T AGT T GA 3515 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I 
Db 3535 T CAT GT GAAC T G C AC GAT AAAGGAAC T C AG GCGCCTCTTCT T AGT T GAT GAT T T AGT T GA 3594 

Qy 3516 TTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGG 3575 

III I I I I I I I I I I I I I I I I I I I I I I I I I I Mill I I 1 I I I I I I I I I I I I I I I I I I I 
Db 3595 TTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTTAATGG 3654 

Qy 3576 T C T GACACT ACT GAT T T TAG C T C T GAT C T C ACT CT T CAGT AT T C CT GT TAT T TAT GAAC G 3635 

I I I II II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3655 TCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTATGAACG 3714 

Qy 3636 G CAT C AGGTGC AGAT AGAT CAT TAT CT AGGAC T T G CAAAC AAGAGT GT TAAGGAT GC CAT 3695 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I II 
Db 3715 G CAT C AGG CAC AGAT AGAT CAT TAT CT AGGACT T GCAAAT AAGAAT GT T AAAGAT GCT AT 3774 

Qy 3696 G GC C AAAAT C CAAGCAAAAAT C C CT GGAT T GAAGC GCAAAGC AGA 3740 

III I I I I I I I I I I M I I I I 1 I I I I I I I I I I I I I I I I I I I I I II 
Db 3775 GGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3819 
/chromosome="2" 
/map="2p16" 
gene 1..4789 
/gene="RTN4" 
/note="synonym: NOGO" 
5'UTR 1..244 
/gene="RTN4" 
/evidence=experimental 
CDS 245..3823 
/gene="RTN4" 
/note="NOGO-A; RTN4-A; alternatively spliced" 
/codon_start=1 
/product="RTN4 isoform A" 
/protein_id="AAM64248.1" 
/db_xref="GI:26800573" 

/translation="MEDLDQSPLVSSSDSPPRPQPAFKYQFVREPEDEEEEEEEEEED 
EDEDLEELEVLERKPAAGLSAAPVPTAPAAGAPLMDFGNDFVPPAPRGPLPAAPPVAP 
ERQPSWDPSPVSSTVPAPSPLSAAAVSPSKLPEDDEPPARPPPPPPASVSPQAEPVWT 
PPAPAPAAPPSTPAAPKRRGSSGSVDETLFALPAASEPVIRSSAENMDLKEQPGNTIS 
AGQEDFPSVLLETAASLPSLSPLSAASFKEHEYLGNLSTVLPTEGTLQENVSEASKEV 
SEKAKTLLIDRDLTEFSELEYSEMGSSFSVSPKAESAVIVANPREEIIVKNKDEEEKL 
VSNNILHNQQELPTALTKLVKEDEVVSSEKAKDSFNEKRVAVEAPMREEYADFKPFER 
VWEVKDSKEDSDMLAAGGKIESNLESKVDKKCFADSLEQTNHEKDSESSNDDTSFPST 
PEGIKDRSGAYITCAPFNPAATESIATNIFPLLGDPTSENKTDEKKIEEKKAQIVTEK 
NTSTKTSNPFLVAAQDSETDYVTTDNLTKVTEEVVANMPEGLTPDLVQEACESELNEV 
TGTKIAYETKMDLVQTSEVMQESLYPAAQLCPSFEESEATPSPVLPDIVMEAPLNSAV 
PSAGASVIQPSSSPLEASSVNYESIKHEPENPPPYEEAMSVSLKKVSGIKEEIKEPEN 
INAALQETEAPYISIACDLIKETKLSAEPAPDFSDYSEMAKVEQPVPDHSELVEDSSP 
DSEPVDLFSDDSIPDVPQKQDETVMLVKESLTETSFESMIEYENKEKLSALPPEGGKP 
YLESFKLSLDNTKDTLLPDEVSTLSKKEKIPLQMEELSTAVYSNDDLFISKEAQIRET 
ETFSDSSPIEIIDEFPTLISSKTDSFSKLAREYTDLEVSHKSEIANAPDGAGSLPCTE 
LPHDLSLKNIQPKVEEKISFSDDFSKNGSATSKVXLLPPDVSALATQAEIESIVKPKV 
LVKEAEKKLPSDTEKEDRSPSAIFSAELSKTSVVDLLYWRDIKKTGVVFGASLFLLLS 
LTVFSIVSVTAYIALALLSVTISFRIYKGVIQAIQKSDEGHPFRAYLESEVAISEELV 
QKYSNSALGHVNCTIKELRRLFLVDDLVDSLKFAVLMWVFTYVGALFNGLTLLILALI 
SLFSVPVIYERHQAQIDHYLGLANKNVKDAMAKIQAKIPGLKRKAE" 
3'UTR 3824..4789 
/gene="RTN4" 

ORIGIN 



Query Match 63.9%; Score 2391.4; DB 9; Length 4789; 

Best Local Similarity 81.0%; Pred. No. 0; 

Matches 3100; Conservative 0; Mismatches 591; Indels 134; Gaps 22; 

Qy 19 GGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCGATCGCGAAGGCAGCAGAA 7 8 

I I I I I I I I I III I I I I I I I I I I I I I I M I MM I MM I II II I II I 

Db 26 GGCGGCGGCGGCAAGTGGGGACAGGGCGGGTGGCGCATCACCGGCGCGGAGGCAGGAGGA 8 5 

Qy 79 GCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTTCGGCTCGGCTCGGCACGA 138 

M I II I I I I II II I I I II I I I I I M I M I I I I I I I II I I 
Db 86 GCAGTCTCATTGTTCCGGGAGCCGTCACCACAGTAGGTCCCTCGG 130 

Qy 139 CTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACTCTGAG 198 

III I I III M I II I II I M I I I I I M II II M I I I M I M I I I II II I M 

Db 131 CTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCTCTGAG 190 



Qy 199 GAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCCATGGA 257 



I II I I 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 191 ACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCCATGGA 24 9 

Qy 258 AGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCTCCGCC 317 

I I I I I I I I I I I I I I I I I I I I I I! I I I I I I I I I I I I I I I II I I I I I III 
Db 250 AGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCGCAGCC 306 

Qy 318 C G C C T T CAAGT AC CAGT T C GT GAC GGAG C C C GAG GAC GAG GAGGAC GAGGAGGAGGAGGA 377 

III I I I I I I II I I I I I II I I I I I I I I I I I II I I I I I I I I II || II I I I II I I I 
Db 307 C GC GT T CAAGT AC CAGT T C GT GAGG G AG C C C GAGGAC GAGG AG GAAGAAGAGGAGGA 363 

Qy 378 GGACGAGGAGGAGGAC GAC GAGGAC CTAGAGGAACT GGAGGT GCT GGAGAGGAAGC CCGC 437 

III I I I I I I II II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 364 G GAAGAGGAGGAC GAG GAC GAAGAC CT GGAG GAG CT G GAG GT GCT GGAGAGGAAG C C C GC 423 

Qy 438 AGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTGCTGGA 4 91 

I I I I I I I I I I I I I I I I Mill I I I I II I I I I I I I I I I I I I I II I 

Db 424 CGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTGATGGA 4 83 

Qy 4 92 CTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGC 551 

I I I I I I I I I I I I I II II I I I I I I I I II II I I I II I I I I I I I I I I I I 

Db 484 CTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCCCCCGT 54 3 

Qy 552 CGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCCGCGCC 602 

II I I I I I I I I I I II I II II I I I I I I I I I I I I I I I I I I I I I 

Db 544 CGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCCGCGCC 603 

Qy 603 ATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCC 662 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I 
Db 604 ATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAGCCTCC 663 

Qy 663 GGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCG 715 

III I II I II I I I I I II II III III I I I I I I I I Ml I I I I I I I 

Db 664 GGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTGTGGAC 723 

Qy 716 CCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGG 755 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 724 CCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGCAGGGG 783 

Qy 756 CTCC GGCT CAGT GGATGAGACCCTTTTT GCT CTTCCT GCT GCATCTGAGCCTGTGAT 812 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I II I I I I I I 

Db 784 CTCCTCGGGCT CAGT GGATGAGACCCTTTTT GCT CTTCCT GCT GCATCTGAGCCTGTGAT 843 

Qy 813 AC C C T C CT CT GCAGAAAAAAT TAT GGAT T T GAT GGAGCAGC C AGGT AAC ACT GT T T C GT C 872 

II I I I I I I I I I I I I I I I I I I I I I I I I I II M I I I I I I I I I I I I I I I Mill I 

Db 844 ACGCTCCTCTGCAGAAAA TAT G GAC T T GAAGGAG CAG C CAGGT AACACT AT T T C GGC 900 

Qy 873 TGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCTCTATC 932 

I I I I I II I I I I I I I I II I II I I I I I I I I I I I II II II I I I I I II II I I I I I I II I I II 
Db 901 TGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCTCTGTC 960 

Qy 933 TCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCAGTGTC 992 

I II I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I II II III I I I I I 
Db 961 TCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACAGTATT 102 0 

Qy 993 AT C C T C AGAAG GAAC AAT T GAAG AAAC T T T AAAT GAAG C T T C T AAAG AGT T G C C AGAG AG 1052 

I II I I I II II M I II I I II I I I I I I I II I II I I I M II I I I I II I I I 



Db 1021 ACCCACTGAAGGAACACTTCAAGAAAATGTCAGTGAAGCTTCTAAAGAGGTCTCAGAGAA 1080 



Qy 1053 G G C AAC AAAT C CAT T T GT AAAT AG AGAT T TAG C AGAAT T T T C AGAAT T AGAAT AT T C AG A 1112 

I I I I I II II I I II I I I I I I I I M I I I II I I I I II I I I I I I I I I I I I I I I I 
Db 1081 GGCAAAAACTCTACT CAT AGAT AGAGATTTAACAGAGTTTT CAGAATT AGAAT ACT CAGA 1140 

Qy 1113 AAT GGGAT CAT CT T T TAAAGGCT C C C CAAAAGGAGAGT C AG C CAT AT T AGT AGAAAAC AC 1172 

I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I 1 I I I I I I I I I I I I I I I I 
Db 1141 AAT GGGAT CAT CGTT CAGTGT CT CT C CAAAAGCAGAAT CT GC C GTAAT AGTAGCAAAT CC 1200 

Qy 1173 TAAGGAAGAAGT AAT T GT GAG GAGT AAA GAC AAAGAGGAT T T AGT T T GT AGT GC AG C 1229 

M I I I I I I I I M I I II I I I I I I II I I I I I I I I I I M I I I I 
Db 1201 T AGG GAAGAAATAAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGT AAT AACAT 1260 

Qy 1230 C CTT CAC AGT C CACAAGAAT CAC CT GT GGGT AAAGAAGACAGAGT 1274 

I 1 I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I III 

Db 1261 C C TT C ATAAT CAACAAGAGT T AC CT AC AGCT CT TACT AAAT T G GT T AAAGAGGAT GAAGT 1320 

Qy 1275 T GT GT CT C C AGAAAAGACAAT G GAC AT TT TT AAT GAAAT GC AGAT GT CAGT AGT AG CAC C 1334 

I I I I I M II I I I I I III I I I I I I I I I I I I I I I I I INI I III II 

Db 1321 T GT GT CT T C AGAAAAAGCAAAAGAC AGTT T T AAT GAAAAGAGAGT T G CAGT GGAAGCT C C 1380 

Qy 1335 T GT GAGGGAAGAGT AT GC AGACT T T AAGC CAT T T GAACAAG CAT GGGAAGT GAAAGAT AC 1394 

I I I I I I I I II I I I I I I I I I I I II II I I I I I I I II I I I I I I I I M M I I I I I 
Db 1381 TAT GAGGGAGGAAT AT GCAGACTT CAAACCATTTGAGCGAGTAT GGGAAGTGAAAGATA- 1439 

Qy 1395 TT AT GAGGGAAGTAGGGAT GT GCT GGCT GCT AGAGCT AATGTGGAAAG 1442 

I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 144 0 — GTAAGGAAGAT AGT GAT AT GT TGGCTGCT GGAG GT AAAAT C GAGAG CAACT T GGAAAG 14 97 

Qy 1443 T AAAGT GGAC AGAAAAT GCT T G GAAGAT AG C CT GGAGCAAAAAAGT CTT GGGAAGGAT AG 1502 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III I I I I I I I I 

Db 14 98 T AAAGT G GAT AAAAAAT GT T TT G C AGAT AG C CT T GAGCAAACT AAT CAC GAAAAAGAT AG 1557 

Qy 1503 T GAAG GCAGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C C AGAAC C T GT GAAGGAC AGCT C 1562 

III I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I II III 
Db 1558 T GAGAGTAGTAAT GATGATACTTCTTTCCCCAGTACGCCAGAAGGTATAAAGGATCGTTC 1617 

Qy 1563 C AGAGCAT AT ATT AC CT GT GCT T C CT T T A C CT CAGCAACCGAAAGCAC CACAGCAAA 1619 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 1618 AGGAGCAT AT AT CAC AT GT GCT C C CT T T AAC C CAG CAG CAACT GAGAGC AT T GCAAC AAA 1677 

Qy 162 0 CAC TTTCCCTTT GT T AGAAGAT CAT AC T T C AGAAAAT AAAAC AGAT GAAAAAAAAAT AGA 1679 

II III I I I I I I I I I I I I I I I I I I I I I I M I I I I I I II I I I I I I I I I I I I I I I I I 

Db 1678 CAT TTTTCCTTT GT T AGGAGAT C CT AC T T C AGAAAAT AAGAC C GAT GAAAAAAAAAT AGA 1737 

Qy 1680 AGAAAGGAAGGCC CAAATTATAAC AGAGAAG ACTAGCCCCAAAACGTCAAATCCTTT 1736 

Mill I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 11,1111 I II II I I I I I 
Db 1738 AGAAAAGAAG GC C CAAAT AGT AAC AGAGAAGAAT ACT AGCAC CAAAAC AT CAAAC C CT T T 1797 

Qy 1737 C CTT GT AGCAGTACAGGAT T CT GAG G C AGAT TAT GT T ACAAC AGAT AC C T TAT CAAAGGT 1796 

I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I M I I Ml I I I I M I 
Db 1798 T CTT GT AG CAGCACAGGAT T CT GAGAC AGAT TAT GT C ACAAC AGAT AAT T T AAC AAAGGT 1857 

Qy 1797 GACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATTTAGTTCAGGAAGC 1856 

I II I I I I I M III I I I I M I I I I I I I II I II I I I I M I II I I I I I I II I I I I I 
Db 1858 GAC T GAGGAAGT C GT G GCAAAC AT G C CT GAAGG C CT GAC T C C AGAT T T AGT AC AGGAAGC 1917 



Qy 1857 AT GT GAAAGT GAAC T GAAT GAAG C C AC AGGT AC AAAGAT T G CT T AT GAAACAAAAGT G G A 1916 

I I I I I II I I I I I I I I I I I I I II II I I I I I I I I II I I I I I I I I II I I I I I I I I I I I 
Db 1918 AT GT GAAAGT GAAT T GAAT GAAGT T ACT GGT AC AAAGAT T GC T TAT GAAACAAAAAT G GA 1977 

Qy 1917 CT TGGT C CAAAC AT C AGAAGC T AT ACAAGAAT C ACT TT AC C C CACAG C AC AGCT TT GC C C 1976 

I II I I I I I I I I I I I I I I I I III Mill I I I I I II II I I I I I I I I I I I I I I I I 
Db 1978 CTTGGTTCAAACATCAGAAGTTATGCAAGAGTCACTCTATCCTGCAGCACAGCTTTGCCC 2037 

Qy 1977 AT CAT T T GAG GAAG C T GAAGCAACT C C GT C AC C AGT TT T GC C T GAT AT T GT T AT GGAAGC 2036 

I I I I M I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 2038 AT CAT T T GAAGAGT C AGAAGC TACT C CTT C AC CAGT TT T GC CT GAC AT T GT TAT GGAAGC 2097 

Qy 2037 ACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCC 2096 

I I I I I I I I I M I I I I I II I II I I I I I I I t I I I I I I I I I I I Mill 

Db 2098 AC CAT T GAAT T CT GC AGT T C CT AGT GCTGGTGCTTCCGT GAT AC AGC C CAG CT CAT CAC C 2157 

Qy 2097 AC T G GAAGC AC CT C CT C CAGT T AGT TAT GAC AGT AT AAAGCT T GAGC CT GAAAAC C CC C C 2156 

I I II I I II II I I I I I I I I I II I II I I I II I I I I II I II II I I I I I I I I 
Db 2158 ATTAGAAG CT T CT T CAGT T AAT TAT GAAAGCAT AAAACAT GAGC C T GAAAAC C C C C C 2214 

Qy 2157 AC CAT AT GAAGAAG C CAT GAAT GT AGC AC T AAAAGCTTTGGGAACAAAGGAAGGAAT 2213 

I I I II M I I I I I I I I I II I I II I MM I I II I I I I I I I I I I I I M Ml 

Db 2215 ACCATATGAAGAGGCCATGAGTGTATCACTAAAAAAAGTATCAGGAATAAAGGAAGAAAT 2274 

Qy 2214 AAAAGAGC CT GAAAGT T T T AAT GCAGCT GT T C AGGAAAC AGAAGCT C CT T AT AT AT CC AT 2273 

I I I I I I I I I I I I I I I M I I II M II I II I I I II II I I I I II I II I I II I I I I II 
Db 2275 T AAAGAGC CT GAAAAT AT T AAT GCAGCT CT T CAAGAAAC AGAAGCT C CT T ATAT AT CT AT 2334 

Qy 2274 T G CGT GT GAT T T AAT TAAAGAAACAAAGCT CT C CACT GAGC CAAGT C C AGAT T T CT CT AA 2333 

III I I I I II II I II I I I II I I I II I I I I I II I I I I III II I I II I M I I I I 
Db 2335 T GC AT GT GAT T T AAT T AAAGAAACAAAGCTT T C T GCT GAAC CAGCT C C G GAT T T CT CT GA 2394 

Qy 2334 T TAT T CAGAAAT AG CAAAAT T C GAGAAGT C G GT G C C CGAACAC GCT G AGCT AGT GGAGG A 2393 

I I I I I I I II I I I I I M I I I I I I I I I I II I I I I I I I I I I I I II I I I M 
Db 2395 T TAT T CAGAAAT GGCAAAAGTT GAAC AGC CAGT G C CT GAT CAT T CT GAGCT AGT T GAAGA 2454 

Qy 2394 TT CCT CAC CT GAAT CT GAAC CAGTT GACTT ATTTAGTGAT GATT CGATT CCT GAAGTCC C 2453 

I I I I I I I I I I I I II M I I II I I I II I I I M I I II I I I I I I I I I I M I I I I I II II 
Db 2 455 T T CC T CAC CT GAT T CT GAAC CAGT T GACT T AT T TAGT GAT GAT T CAAT AC C T G ACGTT C C 2514 

Qy 2 4 54 ACAAACACAAGAGGAGGCTGT GAT GCT CAT GAAGGAGAGTCT CACT GA AGTGTC 2507 

I II I I I II I II 1 I I I II I I I II I I I I I I I I I I M I II I I I II 
Db 2515 ACAAAAACAAGAT GAAACT GT GAT GC T T GT GAAAGAAAGT CT CACT GAGAC T T C AT TT GA 2574 

Qy 2508 T GAGACAGT AG CC C AG CAC AAAG AG G AGAGACT TAGT GC CT CAC CT C AGGAGC T AG GAAA 2567 

I III I Ml II II I I II I I II I I II I III II I I M 
Db 2575 GT CAAT GAT AGAAT AT GAAAAT AAGGAAAAACT CAGT GCT T T GC CAC CT GAGGGAGGAAA 2634 

Qy 2568 G C CAT AT T T AGAGT CT T T T CAG C C CAAT T T AC AT AGT AC AAAAGAT GC TGCATCTAA 2624 

I I I II I I I I M II II I I II I II I I I I I II I I I I I I I I I I I I I I 

Db 2635 G C CAT AT T T G GAAT C T T T T AAG C T CAGT T TAG AT AAC AC AAAAGAT AC C C T GT T AC C T G A 2694 

Qy 2625 T GACATT C CAACATT GAC CAAAAAGGAGAAAATTT CTTT GCAAAT GGAAGAGTTTAAT AC 2684 

III II I I I M I 11 I II II I I II I I II I I M I I I I I I I I I I II III I I III 
Db 2695 T GAAGT T T CAACAT T GAGCAAAAAGGAGAAAAT T CCTTT GCAGAT GGAGGAGCT CAGT AC 27 54 



Qy 2685 TGCAATTT AT T CAAAT GAT GACTTACTTT CT T CTAAGGAAGACAAAATAAAAGAAAGT GA 2744 

I I I I I I I I I I I I I I I I 1 I I I I I I I II I I I I I I I I I I I | M I | I I I | | IN 
Db 2755 T GCAGTTTATT CAAAT GAT GACTT ATTTATTT CTAAGGAAGCACAGATAAGAGAAACT GA 2814 

Qy 27 45 AAC AT T T T CAGAT T CAT CT CC GAT T GAGATAAT AGAT GAAT T T C C CAC GT T T GT CAGT G C 2 8 04 

I M I I I I II I I I I II I I II I I I I I I II I I I I I I I I II || || || I I | M I 
Db 2815 AAC GT T T T CAGAT T CAT CT C CAAT T GAAAT TAT AGAT G AGT T C CC T AC AT T GAT CAGT T C 2874 

Qy 2805 T AAAGAT GAT T C T C CTAAATT AGCCAAGGAGT ACACT GAT CT AGAAGT AT C CGACAA 2861 

I I M I I I I I I I I I I I I I I I I I I I III II I I I I I I I I I I I I I I I I I I I I I 
Db 2875 T AAAAC T GAT T CAT T T T CT AAAT T AGC CAGGGAAT AT AC T GAC C T AGAAGT AT C C CACAA 2934 

Qy 2 862 AAGT GAAAT T GCT AAT AT C CAAAGC G GGGC AGAT T CAT TGCCTTGCT T AGAAT T G C C C T G 2 921 

I M I I I I I I I I M I I I II I I I I I I I I I I I I I II II I I I I I I I I I I 

Db 2935 AAGT GAAAT T G C T AAT GC C C C G GAT GGAG CT G GGT CAT T GC CT T GCAC AGAAT T GC C C C A 2994 

Qy 2 922 T GAC CTTT CTTT CAAGAATATAT AT C CTAAAGAT GAAG TACATGTTT CAGAT GA 2975 

I I I I II I I II I I I I I I III I II I I I I I I I I I | | | I I I I I I I I 

Db 2995 T GAC CT T T C T T T GAAGAAC AT AC AAC C CAAAGT T GAAGAGAAAAT CAGT T T CT CAGAT G A 3 054 

Qy 2 976 AT T C T C C GAAAAT AG GT CC AGT GT AT CTAAGGC AT C CAT AT CGC CT T CAAAT GT CT CT G C 3035 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I 

Db 3055 CTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTTTCTGC 3114 

Qy 3036 T T T G GAAC CT C AGAC AGAAAT GGGC AGCAT AGT TAAAT C CAAAT C ACT T AC GAAAGAAGC 3095 

I I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I Ml I II I I I I II 
Db 3115 T TT GGC CAC T CAAGC AGAGAT AGAGAG CAT AGT T AAAC C CAAAGT T CT T GT GAAAGAAG C 317 4 

Qy 3096 AGAGAAAAAACT T C CTT CT GACACAGAGAAAGAGGACAGATCC CT GT CAGCT GTATT GT C 3155 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I M I I I I II 

Db 3175 T GAGAAAAAACT T C C T T C C GAT AC AGAAAAAGAG GAC AGAT CAC CAT CT GCT AT AT TT T C 3234 

Qy 3156 AGCAGAGC T GAGT AAAAC T T CAGT T GT T GAC C T C C T CT ACT GGAGAGACAT TAAGAAGAC 3215 

I I M I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II M I I I I I I 
Db 3235 AGCAGAG CT GAGT AAAACT T CAGT T GT T GAC CT C CT GT ACT GGAGAGACAT TAAGAAGAC 32 94 

Qy 3216 TGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGT 3275 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I II I I I I I II I I I I I 
Db 3295 TGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGCATTGT 3354 

Qy 3276 CAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATA 3335 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I I 

Db 3355 GAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGGATATA 3414 

Qy 3336 T AAGGGC GT GAT C C AGGC T AT C C AGAAAT CAGAT GAAGG C CAC C CAT T C AGGGCAT AT T T 3395 

I I M I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 3415 CAAGG GT GT GAT C CAAG CT AT C C AGAAAT CAGAT GAAGGCCAC C C ATT C AGGGCAT AT C T 3474 

Qy 3396 AGAAT C T GAAGT T GCT AT AT CAG AGGAAT T GGT T CAGAAAT AC AGTAAT TCTGCTCTTGG 3455 

I I I I M I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 3475 G GAAT CT GAAGT T G CT AT AT C T GAGGAGT T G GT T CAGAAGT AC AGT AAT TCTGCTCTTGG 3534 

Qy 3456 T CAT GT GAAC AG C ACAAT AAAAGAAC T GAG GCGGCTTTT CT T AGT T GAT GAT T T AGTT GA 3515 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I || I I I I I I I I I I I | I I I I I I I I I I I 
Db 3535 T CAT GT GAAC T GCAC GATAAAGGAACTCAGGCGCCTCTTCTTAGTT GAT GAT TT AGT TGA 3594 

Qy 3516 TTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGG 3575 







1 1 I 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I I I || I | | | | | M 




Db 3595 TTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTTAATGG 3654 


Qy 3576 TCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTATGAACG 3635 






1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 II || I I I I I I I I I I || II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 





Db 3655 TCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTATGAACG 3714 


Qy 3636 GCATCAGGTGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGATGCCAT 3695 






1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 




Db 3715 GCATCAGGCACAGATAGATCATTATCTAGGACTTGCAAATAAGAATGTTAAAGATGCTAT 3774 


Qy 3696 GGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740 








Ml M 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 II I I I I I I I || 1 1 1 1 1 1 1 1 II 




Db 3775 GGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3819 
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AC131431 218532 bp DNA linear HTG 19-NOV-2002 

Rattus norvegicus clone CH230-256K14, WORKING DRAFT SEQUENCE. 
AC131431 

AC131431.3 GI:25084347 

HTG; HTGS_PHASE2; HTGS_DRAFT; HTGS_FULLTOP. 
Rattus norvegicus (Norway rat) 
Rattus norvegicus 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; 
Rattus. 

1 (bases 1 to 218532) 
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Allen, C, Allen, H., Alsbrooks , S . , Amin,A., Anguiano,D., 
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Biswalo,K., Blair, J., Blankenburg, K. , Blyth,P., Brown, M. , 
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Chacko,J., Chavez, D., Chen,G., Chen,R., Chen,Y., Chen, Z . , Chu,J., 
Cleveland,C., Cockrell,R., Cox,C., Coyle,M., Cree,A., D'Souza,L., 
Davila,M.L., Davis,C., Davy-Carroll,L., DeAnda,C., Dederich,D., 
Delgado,O., Denson,S., Deramo,C., Ding,Y., Dinh,H., Divya,K., 
Draper,H., Dugan-Rocha,S., Dunn,A., Durbin,K., Duval,B., Eaves,K., 
Egan,A., Escotto,M., Eugene,C., Evans,C.A., Falls,T., Fan,G., 
Fernandez,S., Finley,M., Flagg,N., Forbes,L., Foster,M., Foster,P., 
Fraser,C.M., Gabisi,A., Ganta,R., Garcia,A., Garner,T., Garza,M., 
Gebregeorgis,E., Geer,K., Gill,R., Grady,M., Guerra,W., Guevara,W., 
Gunaratne,P., Haaland,W., Hamil,C., Hamilton,C., Hamilton,K., 
Harvey,Y., Havlak,P., Hawes,A., Henderson,N., Hernandez,J., 

Hernandez,R., Hines,S., Hladun,S.L., Hodgson,A., Hogues,M., 
Hollins,B., Howells,S., Hulyk,S., Hume,J., Idlebird,D., Jackson,A., 
Jackson,L., Jacob,L., Jiang,H., Johnson,B., Johnson,R., Jolivet,A., 
Karpathy,S., Kelly,S., 
Lorensuhewa,L., Loulseged,H., Lozado,R.J., Lu,X., Ma,J., 
Maheshwari,M., Mahindartne,M., Mahmoud,M., Malloy,K., Mangum,A., 
Mangum,B., Mapua,P., Martin,K., Martin,R., Martinez,E., 
Mawhiney,S., McLeod,M.P., McNeill,T.Z., Meenen,E., 



Milosavljevic,A., Miner,G., Minja,E., Montemayor,J., Moore,S., 
Morgan,M., Morris,K., Morris,S., Munidasa,M., Murphy,M., Nair,L., 
Nankervis,C., Neal,D., Newton,N., Nguyen,N., Norris,S., 
Nwaokelemeh,O., Okwuonu,G., Olarnpunsagoon,A., Pal,S., Parks,K., 
Pasternak,S., Paul,H., Perez,A., Perez,L., Pfannkoch,C., 
Plopper,F., Poindexter,A., Popovic,D., Primus,E., Pu,L.-L., 
Puazo,M., Quiroz,J., Rachlin,E., Reeves,K., Regier,M.A., Reign,R., 
Reilly,B., Reilly,M., Ren,Y., Reuter,M., Richards,S., Riggs,F., 
Rives,C., Rodkey,T., Rojas,A., Rose,M., Rose,R., Ruiz,S.J., 
Sanders,W., Savery,G., Scherer,S., Scott,G., Shatsman,S., Shen,H., 
Shetty,J., Shvartsbeyn,A., Sisson,I., Sitter,C.D., Smajs,D., 
Sneed,A., Sodergren,E., Song,X.-Z., Sorelle,R., Sosa,J., 
Steimle,M., Strong,R., Sutton,A., Svatek,A., Tabor,P., Taylor,C., 
Taylor,T., Thomas,N., Thomas,S., Tingey,A., Trejos,Z., Usmani,K., 
Valas,R., Vera,V., Villasana,D., Waldron,L., Walker,B., Wang,J., 
Wang,Q., Wang,S., Warren,J., Warren,R., Wei,X., White,F., 
Williams,G., Willson,R., Wleczyk,R., Wooden,H., Worley,K., 
Wright,D., Wright,R., Wu,J., Yakub,S., Yen,J., Yoon,L., Yoon,V., 
Yu,F., Zhang,J., Zhou,J., Zhou,X., Zhao,S., Dunn,D., von 
Niederhausern,A., Weiss,R., Smith,D.R., Holt,R.A., Smith,H.O., 
Weinstock,G. and Gibbs,R.A. 
TITLE       Direct Submission 
JOURNAL     Unpublished 
REFERENCE   2 (bases 1 to 218532) 
AUTHORS     Rat Genome Sequencing Consortium. 
TITLE       Direct Submission 
JOURNAL     Submitted (22-AUG-2002) Human Genome Sequencing Center, Department 
            of Molecular and Human Genetics, Baylor College of Medicine, One 
            Baylor Plaza, Houston, TX 77030, USA 
REFERENCE   3 (bases 1 to 218532) 
AUTHORS     Rat Genome Sequencing Consortium. 
TITLE       Direct Submission 
JOURNAL     Submitted (19-NOV-2002) Human Genome Sequencing Center, Department 
            of Molecular and Human Genetics, Baylor College of Medicine, One 
            Baylor Plaza, Houston, TX 77030, USA 
COMMENT     On Nov 19, 2002 this sequence version replaced gi:23101715. 
The sequence in this assembly is a combination of BAC based reads 
and whole genome shotgun sequencing reads assembled using Atlas 
(http://www.hgsc.bcm.tmc.edu/projects/rat/). Each contig described 
in the feature table below represents a scaffold in the Atlas 
assembly (a 'contig-scaffold'). Within each contig-scaffold, 
individual sequence contigs are ordered and oriented, and separated 
by sized gaps filled with Ns to the estimated size. The sequence 
may extend beyond the ends of the clone and there may be sequence 
contigs within a contig-scaffold that consist entirely of whole 
genome shotgun sequence reads. Both end sequences and whole genome 
shotgun sequence only contigs will be indicated in the feature 
table. 

Genome Center 
  Center: Baylor College of Medicine 
  Center code: BCM 
  Web site: http://www.hgsc.bcm.tmc.edu/ 
  Contact: hgsc-help@bcm.tmc.edu 
Project Information 
  Center project name: GMSN 
  Center clone name: CH230-256K14 
Summary Statistics 
  Assembly program: Phrap; version 0.990329 
  Consensus quality: 185268 bases at least Q40 
  Consensus quality: 186898 bases at least Q30 
  Consensus quality: 188068 bases at least Q20 
  Estimated insert size: 191844; sum-of-contigs estimation 
  Quality coverage: 7x in Q20 bases; sum-of-contigs estimation 

NOTE: Estimated insert size may differ from sequence length 
(see http://www.hgsc.bcm.tmc.edu/docs/Genbank_draft_data.html) 
NOTE: This is a 'working draft' sequence. It currently 
consists of 1 contigs. Gaps between the contigs 
are represented as runs of N. The order of the pieces 
is believed to be correct as given, however the sizes 
of the gaps between them are based on estimates that have 
provided by the submittor. 
This sequence will be replaced 
by the finished sequence as soon as it is available and 
the accession number will be preserved. 

1 218532: contig of 218532 bp in length. 

FEATURES             Location/Qualifiers 
     source          1..218532 
                     /organism="Rattus norvegicus" 
                     /mol_type="genomic DNA" 
                     /db_xref="taxon:10116" 
                     /clone="CH230-256K14" 
     misc_feature    1..2141 
                     /note="wgs_contig" 
ORIGIN 



Query Match 62.9%; Score 2353.8; DB 2; Length 218532; 
Best Local Similarity 99.7%; Pred. No. 0; 
Matches 2358; Conservative 0; Mismatches 7; Indels 0; Gaps 0; 



Qy        824 CAGAAAAAATTATGGATTTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGG 883 
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db     147688 CAGAAAAAATTATGGATTTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGG 147629 

Qy        884 ATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAA 943 
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db     147628 ATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAA 147569 

Qy        944 CTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCAGTGTCATCCTCAGAAG 1003 
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db     147568 CTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCAGTGTCATCCTCAGAAG 147509 

Qy       1004 GAACAATTGAAGAAACTTTAAATGAAGCTTCTAAAGAGTTGCCAGAGAGGGCAACAAATC 1063 
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db     147508 GAACAATTGAAGAAACTTTAAATGAAGCTTCTAAAGAGTTGCCAGAGAGGGCAACAAATC 147449 

Qy       1064 CATTTGTAAATAGAGATTTAGCAGAATTTTCAGAATTAGAATATTCAGAAATGGGATCAT 1123 
              |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||| 
Db     147448 CATTTGTAAATAGAGATTTAGCAGAATTTTCAGAATTAGAATATTCAGAAATGGGATCAT 147389 



Qy 1124 CTTTTAAAGGCTCCCCAAAAGGAGAGTCAGCCATATTAGTAGAAAACACTAAGGAAGAAG 1183 

I I I I I I I I M I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I | I I I 

Db 147388 CT T T T AAAGGCT C C C C AAAAG GAGAGT C AGC CAT AT T AGT AGAAAAC AC T AAGGAAGAAG 

147329 



Qy 1184 TAATTGTGAGGAGTAAAGACAAAGAGGATTTAGTTTGTAGTGCAGCCCTTCACAGTCCAC 1243 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | 
Db 147328 TAAT T GT GAGGAGTAAAGACAAAGAG GAT T T AGTT T GTAGT GC AGC C C T T C AC AGT C C AC 

147269 



Qy 124 4 AAGAAT C ACCT GT G G GT AAAGAAGACAGAGT T GT GT CTC C AGAAAAGACAAT GGAC AT T T 1303 

M II I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I | | | | M | | | | | | | | | | | | | I M 
Db 147268 AAGAAT CACCT GT G GGTAAAGAAGACAGAGT T GT GT C T C C AGAAAAGACAAT GGAC AT T T 

147209 



Qy 1304 T TAAT GAAAT GC AGAT GT C AGT AGT AG C AC CT GT GAGGGAAGAGT AT GC AGACTT TAAG C 1363 

M I I I I I I I I II I I M I I I I I I I II M | | | | | | M | | | | | | | | | | | | | | | | | | | | | | | | | 
Db 14 7208 TTAAT GAAAT GCAGAT GT CAGTAGTAGCACCT GT GAGGGAAGAGT AT GCAGACTTTAAGC 

147149 



Qy 1364 CAT T T GAACAAGC AT G GGAAGT GAAAGAT AC T TAT GAGG GAAGT AGG GAT GT GCTGGCTG 1423 

I M I I I I I I I I I I I I I I I I I I M I II I I I I II II I I I I I II I I II I I I M I I I | I 

Db 14 714 8 CAT TT GAACAAGC AT GGGAAGT GAAAGAT ACT TAT GAGGGAAGTAGGGAT GT GCT GGCT G 

147089 



Qy 1424 CTAGAGCTAAT GTGGAAAGTAAAGT GGACAGAAAAT GCTTGGAAGAT AGCCT GGAGCAAA 14 8 3 

M M I I I II I I I I I I II I I II I I I I I I I I II I I I I I M I I I I I I II I I II I I I I II I I I I 
Db 147088 CTAGAGCTAATGTGGAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCT GGAGCAAA 

147029 



Qy 14 8 4 AAAGT CT T G G GAAG GAT AGT GAAGG CAGAAAT GAG GAT GCTTCTTTCCC C AGT AC C C C AG 1543 

I I I I I I I I I I M I I M I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I || 
Db 14 7 028 AAAGT C T T G GGAAGGAT AGT GAAGG CAGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C C AG 

146969 



Qy 1544 AACCTGTGAAGGACAGCTCCAGAGCATATATTACCTGTGCTTCCTTTACCTCAGCAACCG 1603 

I I M I I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 
Db 14 69 68 AAC C T GT GAAG GACAG CT CCAGAGC AT AT AT T AC CTGTGCTTCCTT T AC CT C AGCAAC C G 

146909 



Qy 1604 AAAGCAC CACAGCAAACACTTT CCCT TT GTT AGAAGAT CATACTT CAGAAAATAAAACAG 1663 

I I I I I I I I M I M I I I I I I I I I I I I I I I II I I I II I I I I I I I I I II I M I I I I I I I I I I I 
Db 146908 AAAG CAC CAC AG C AAAC ACT TT C C CT T T GT T AGAAGAT CATACTT CAGAAAATAAAACAG 

146849 



Qy 1664 AT GAAAAAAAAAT AGAAGAAAGGAAGGC C CAAAT T AT AAC AGAGAAGACT AGC C C CAAAA 1723 

I I I I I I I I M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 146848 AT GAAAAAAAAAT AGAAGAAAGGAAG GC C CAAAT TAT AAC AGAGAAGACT AG C C C CAAAA 

146789 



Qy 1724 C GT CAAAT CCTTTCCTT GT AGC AGT ACAG GAT T C T GAG G C AGAT TAT GT T AC AACAGAT A 1783 

I I I I I I I M M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I 
Db 14 6788 C GT CAAAT CCTTTCCTT GT AGC AGT ACAGGAT T CT GAGGC AGAT TAT GT T ACAAC AGAT A 

146729 



Qy 1784 CCTTATCAAAGGTGACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATT 1843 



Db 14 672 8 C CT TAT CAAAG GT GACT GAG G C AG C AGT GT C AAAC AT G C CT GAAGGT CT GAC G C C AGATT 

146669 

Qy 1844 T AGT T C AGGAAGC AT GT GAAAGT GAACT GAAT GAAGC CAC AG GT ACAAAGAT T GCT T AT G 1903 

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I I I I I II M I I I I II 
Db 14 6668 TAGTTCAGGAAG CAT GT GAAAGT GAACT GAAT GAAGC CACAGGT ACAAAGAT T GCT TAT G 

146609 

Qy 1904 AAACAAAAGT G GAC T T G GT C CAAACAT C AGAAGCT AT AC AAGAAT CAC T T T AC C C CACAG 1963 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I II II I I I I I I I I I I I I I I I I I I I | I I | 
Db 14 6608 AAACAAAAGT GGACTT GGT C CAAACAT CAGAAGCTAT ACAAGAAT CACT TTACCCCACAG 

146549 

Qy 1964 C AC AGCT T T G CC C AT CAT T T GAGGAAGC T GAAGCAACT C CGT CAC C AGT T T T GC CT GAT A 2023 

I I I M II I I I I I I I I I I I I I I I I I I || I I I I I I I I I | | M I I I M I I I I I I I I I I I I I I I 
Db 146548 CACAGCTTTGCCCATCATTTGAGGAAGCT GAAGCAACT CCGTCACCAGTTTTGCCT GAT A 

146489 

Qy 2 02 4 TTGTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGC 208 3 

I I I I I I M I I I I I I I I I I I I II I I I I I M I I M I I I I I I I I I I I I I II II I I I I I I I I I I 
Db 146488 TTGTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGC 

146429 

Qy 2 084 CCAGTGTATCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGC 2143 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I || I I I | | 
Db 146428 C C AGT GT AT C C C C ACT GGAAGC AC CT C CT C C ACT T AGT TAT GAC AGT AT AAAGCT T GAGC 

146369 

Qy 2144 CT GAAAAC C C C CCACCATAT GAAGAAGCCAT GAAT GTAGCACTAAAAGCTTT GGGAACAA 22 03 

I I I I I I I I I I I I I I I I I I I I I I I M II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 14 6368 CT GAAAAC C C C C CAC CAT AT GAAGAAGC CAT GAAC GT AGC ACT AAAAGCT T T G GGAACAA 

146309 

Qy 2204 AGGAAG GAAT AAAAGAG C C T GAAAGT T T TAAT G CAG C T GT T CAGGAAAC AGAAG CT C CT T 2263 

I I I I I I I II I I I I I I I I I I I I I I I M II I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I 
Db 146308 AGGAAGGAAT AAAAGAGC CT GAAAGT T T TAAT GCAGCT GT T CAGGAAAC AGAAGCT C C T T 

146249 

Qy 2264 AT AT AT C CAT T GC GT GT GAT T TAAT T AAAGAAACAAAGC T CT C CACT GAGC CAAGT C CAG 2323 

I I M I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 14 624 8 AT AT AT C CAT T GC GT GT GAT TTAAT TAAAGAAACAAAG CT C T C CACT GAGC CAAGT C CAG 

146189 

Qy 2324 AT T T CT CTAAT T AT TC AGAAAT AG CAAAAT T C GAGAAGT C GGT GC C C GAAC AC GCT GAGC 2383 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 14618 8 AT T T CT CTAAT TAT T CAGAAAT AGC AAAAT T C GAGAAGT C G GT GC CC GAAC AC GCT GAG C 

146129 

Qy 2 38 4 TAGT G GAGGAT T C CT CAC CT GAAT CT GAAC CAGTT GACT TAT TTAGT GAT GAT TC GAT TC 2443 

I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I || I I | I I M I 
Db 14 6128 TAGT GGAGGAT T C CT CAC CT GAAT CT GAAC CAGTT GAC T TAT T TAGT GAT GAT T C GAT T C 

146069 

Qy 24 4 4 C T GAAGT C C CAC AAAC ACAAGAG GAG G CT GT GAT GCT CAT GAAG GAGAGT CT CACT GAAG 2503 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 14 6068 CTGAAGTCCCACAAACACAAGAGGAGGCTGTGATGCTCATGAAGGAGAGTCTCACTGAAG 

146009 



Qy 2504 T GT CT GAGAC AGT AGC C C AGC AC AAAGAG GAGAGACT T AGT G CC T C AC CT C AGGAGCT AG 2563 

M I I I I I I I I I I I I I I | | | | I | | || || | | | | | | | | | M I I I I I I I I I I 

Db 14 6008 T GT CT GAGAC AGT AGC C C AG C ACAAAGAGGAGAGACT T AGT G CCT C AC CT C AGGAGCT AG 

145949 

Qy 2564 GAAAGCCATATTTAGAGTCTTTTCAGCCCAATTTACATAGTACAAAAGATGCTGCATCTA 2623 

I I I M I I I I I I I I I I I II I I I I I I | | I I I | I M | | | | || | | || | | | | M I I I I II 

Db 14594 8 GAAAGCCATATTTAGAGTCTTTTCAGCCCAATTTACATAGTACAAAAGATGCTGCATCTA 

145889 

Qy 2624 AT GAC AT T C CAAC AT T GAC CAAAAAG GAGAAAAT T T CT TT GCAAAT GGAAGAGT T T AAT A 2683 

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I II II I II I I I I I I I I I I I I I I I I I I I | | || 
Db 14 58 88 AT GAC AT T C CAACAT T GAC CAAAAAGGAGAAAAT TT CT TT GCAAAT GGAAG AGTT T AAT A 

145829 

Qy 2 68 4 CT G CAAT T TAT T C AAAT GAT GACT T ACT T T C T T CT AAGGAAGAC AAAAT AAAAGAAAGT G 27 43 

I I I I M I I I I I I I I I I I I I I I I I I I I I I | M I I II I I I I I I I I I I I I I I I I I I I I II I I I 
Db 145828 C T GCAAT T TAT T CAAAT GAT GACT T ACTT T C T T CTAAGGAAGACAAAAT AAAAGAAAGT G 

145769 

Qy 2744 AAAC AT T T T C AG AT T CAT C T C C GAT T GAG AT AAT AG AT GAAT T T C C C AC GT T T GT C AG T G 28 03 

I I I I M I I I I II M I I I I I I I I I I II I I I I II I II I I II I II I I I I I I II I I I I I I I I | | 
Db 145768 AAACAT T T T CAGAT T CAT CT C C GAT T GAGATAAT AGAT GAAT T T C C CAC GT T T GT CAGT G 

145709 

Qy 2 804 C T AAAGAT GAT T CT C CT AAAT TAG C CAAGGAGT ACACT GAT C T AGAAGT AT C C GACAAAA 28 63 

I I I I I I I I M I I I II I I I I I I I II I II II I I I I I I I I I I I I I I I I I I I I I I I I || I I I | | 
Db 145708 CT AAAGAT GAT T C T C C T AAAT TAG C CAAGGAGTACACT GAT C T AGAAGT AT C C GACAAAA 

145649 

Qy 2 8 64 GT GAAAT T GCTAAT AT C CAAAGC G GGGC AGAT T CAT TGCCTTGCT T AGAAT T GC C CT GT G 2 923 

I I I M I I I II I I I I I I I I I I I I I I I | | | | M | | | | | | | | | | | | | | | | | || || | | | | | | | | 
Db 14564 8 GT GAAAT T G CT AAT AT C CAAAG C GGGG C AGATT CAT TGCCTTGCT T AGAAT T G C C CT GT G 

145589 

Qy 2 924 ACCTTT CTTT CAAGAATATAT AT C CTAAAGAT GAAGTACAT GTTT CAGAT GAATT CT CC G 2 983 

I I I I I I I I I I I I 1 I I I I I II I I I I I II I I I I II I I I I I I I I I I I I I I I I II || I M II M 
Db 14 55 8 8 ACCT TTCTTT CAAGAATATAT AT C CTAAAGAT GAAGTACAT GTTT CAGAT GAATT CTCC G 

145529 

Qy 2 984 AAAATAGGTCCAGTGTATCTAAGGCATCCATATCGCCTTCAAATGTCTCTGCTTTGGAAC 3043 

I I I I I I I I I I I I I I I M I I I I I I I I I I I I || I || I I I I I | | | | || | | | | | | | M | | | | | | 
Db 145528 AAAAT AGGTCCAGTGTATCTAAGGCATCCATATCGCCTT CAAAT GTCTCTGCTTTGGAAC 

145469 

Qy 3044 CT C AGACAGAAAT G GGCAGC AT AGTT AAAT C CAAAT C ACT T AC GAAAGAAGCAGAGAAAA 3103 

I I M I II I I I I II I I I I I I I I I I 1 I | | | | | | | | | | | | || | | | | | | | | | | | | | | | | | M II 
Db 145468 CT CAGACAGAAAT GGGCAGCAT AGTTAAAT CCAAAT CAC T T AC GAAAG AAG C AG AGAAAA 

145409 

Qy 3104 AACTT CCT T CT GACACAGAGAAAGAGGACAGAT CC CT GT CAGCT GTATT GT CAGCAGAGC 3163 

I M II II I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | || M | | || | | |i | || | | | | | | 
Db 1454 08 AACTT CCTT CT GACACAGAGAAAGAGGACAGAT CCCT GT CAGCT GTATT GT CAGCAGAGC 

145349 



Qy 3164 TGAGTAAAACTTCAGTTGTTGACCT 3188 

I I I I I I I I I I I I I I I I I I I I 
Db 145348 TGAGTAAAACTTCAGGTAATAATCT 145324 



RESULT 11 

AC133315 

LOCUS       AC133315 238341 bp DNA linear HTG 19-NOV-2002 
DEFINITION  Rattus norvegicus clone CH230-525J22, WORKING DRAFT SEQUENCE, 2 
            unordered pieces. 
ACCESSION   AC133315 
VERSION     AC133315.2 GI:25073594 
KEYWORDS    HTG; HTGS_PHASE1; HTGS_DRAFT; HTGS_FULLTOP. 
SOURCE      Rattus norvegicus (Norway rat) 
ORGANISM    Rattus norvegicus 
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 
            Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; 
            Rattus. 
REFERENCE   1 (bases 1 to 238341) 
AUTHORS     Muzny,D.Marie., Metzker,M.Lee., Abramzon,S., Adams,C., Alder,J., 
Allen,C., Allen,H., Alsbrooks,S., Amin,A., Anguiano,D., 
Anyalebechi,V., Aoyagi,A., Ayodeji,M., Baca,E., Baden,H., 
Baldwin,D., Bandaranaike,D., Barber,M., Barnstead,M., Benahmed,F., 
Biswalo,K., Blair,J., Blankenburg,K., Blyth,P., Brown,M., 
Bryant,N., Buhay,C., Burch,P., Burrell,K., Calderon,E., 
Cardenas,V., Carter,K., Cavazos,I., Ceasar,H., Center,A., 
Chacko,J., Chavez,D., Chen,G., Chen,R., Chen,Y., Chen,Z., Chu,J., 
Cleveland,C., Cockrell,R., Cox,C., Coyle,M., Cree,A., D'Souza,L., 
Davila,M.L., Davis,C., Davy-Carroll,L., DeAnda,C., Dederich,D., 
Delgado,C., Denson,S., Deramo,C., Ding,Y., Dinh,H., Divya,K., 
Draper,H., Dugan-Rocha,S., Dunn,A., Durbin,K., Duval,B., Eaves,K., 
Egan,A., Escotto,M., Eugene,C., Evans,C.A., Falls,T., Fan,G., 
Fernandez,S., Finley,M., Flagg,N., Forbes,L., Foster,M., Foster,P., 
Fraser,C.M., Gabisi,A., Ganta,R., Garcia,A., Garner,T., Garza,M., 
Gebregeorgis,E., Geer,K., Gill,R., Grady,M., Guerra,W., Guevara,W., 
Gunaratne,P., Haaland,W., Hamil,C., Hamilton,C., Hamilton,K., 
Harvey,Y., Havlak,P., Hawes,A., Henderson,N., Hernandez,J., 

, Hladun,S.L., Hodgson, A. , Hogues,M., 
, Hulyk,S., Hume, J., Idlebird,D., Jackson, A. , 
Jiang, H. , Johnson, B., Johnson, R. , Jolivet,A., 
Kelly, S., Khan,Z., King,L., Kovar,C, 
Kowis,C, Kraft, C.L., Lebow,H., Levan,J., Lewis, L., Li,Z., Liu, J., 
Liu, J., Liu,W., Liu, Y . , London, P., Longacre,S., Lopez, J., 
Lorensuhewa, L. , Loulseged, H . , Lozado,R.J., Lu,X., Ma, J., 
Maheshwari,M. , Mahindartne,M. , Mahmoud,M., Malloy,K., Mangum,A., 
Mangum, B., Mapua,P., Martin, K. , Martin, R. , Martinez, E., 
Mawhiney,S., McLeod,M.P., McNeill , T . Z . , Meenen,E., 
Milosavljevic,A. , Miner, G. , Minja,E., Montemayor , J . , Moore, S., 
Morgan, M. , Morris, K., Morris, S., Munidasa,M., Murphy, M. , Nair,L., 
Nankervis,C. , Neal,D., Newton, N., Nguyen, N., Norris,S., 
Nwaokelemeh, O. , Okwuonu,G., Olarnpunsagoon, A. , Pal,S., Parks, K., 
Pasternak, S. , Paul,H., Perez, A., Perez, L., Pf annkoch, C . , 
Plopper,F., Poindexter, A. , Popovic,D., Primus, E., Pu,L.-L., 
Puazo,M., Quiroz,J., Rachlin,E., Reeves, K., Regier,M.A. , Reigh,R., 
Reilly,B., Reilly,M., Ren,Y., Reuter,M., Richards, S., Riggs,F., 
Rives, C, Rodkey,T., Rojas,A., Rose,M., Rose,R., Ruiz, S. J., 



Hernandez, R. , Hines, S , 
Hollins , B . , Howells , S , 
Jackson, L., Jacob, L., 
Karpathy, S . , Kelly, S . , 



TITLE 
JOURNAL 
REFERENCE 
AUTHORS 
TITLE 
JOURNAL 



REFERENCE 
AUTHORS 
TITLE 
JOURNAL 



COMMENT 



Sanders, W., Savery,G., Scherer,S., Scott, G., Shatsman,S., Shen,H., 
Shetty,J., Shvartsbeyn, A. , Sisson,I., Sitter, CD., Smajs,D., 
Sneed,A. , Sodergren, E . , Song,X.-Z., Sorelle,R., Sosa,J., 
Steimle,M. , Strong, R. , Sutton, A., Svatek,A. , Tabor, P., Taylor, C, 
Taylor, T., Thomas , N . , Thomas, S., Tingey,A., Trejos,Z., Usmani,K., 
Valas,R., Vera,V., Villasana, D . , Waldron,L., Walker, B . , Wang, J. , 
Wang,Q., Wang,S., Warren, J. , Warren, R. , Wei,X., White, F. , 
Williams, G. , Willson,R., Wleczyk,R., Wooden, H., Worley,K., 
Wright, D., Wright, R. , Wu,J., Yakub,S., Yen, J., Yoon,L., Yoon,V., 
Yu,F., Zhang, J., Zhou, J., Zhou,X., Zhao,S., Dunn,D., von 
Niederhausern,A. , Weiss, R. , Smith, D.R., Holt, R. A., Smith, H.O., 
Weinstock,G. and Gibbs,R.A. 
Direct Submission 
Unpublished 

2 (bases 1 to 238341) 

Rat Genome Sequencing Consortium. 
Direct Submission 

Submitted ( 10-SEP-2 002 ) Human Genome Sequencing Center, Department 
of Molecular and Human Genetics, Baylor College of Medicine, One 
Baylor Plaza, Houston, TX 77030, USA 

3 (bases 1 to 238341) 

Rat Genome Sequencing Consortium. 
Direct Submission 

Submitted ( 19-NOV-2002 ) Human Genome Sequencing Center, Department 
of Molecular and Human Genetics, Baylor College of Medicine, One 
Baylor Plaza, Houston, TX 77030, USA 

On Nov 19, 2002 this sequence version replaced gi: 22771260. 
The sequence in this assembly is a combination of BAC based reads 
and whole genome shotgun sequencing reads assembled using Atlas 
(http://www.hgsc.bcm.tmc.edu/projects/rat/). Each contig described 
in the feature table below represents a scaffold in the Atlas 
assembly (a 1 contig-scaf f old ' ) . Within each contig-scaf fold, 
individual sequence contigs are ordered and oriented, and separated 
by sized gaps filled with Ns to the estimated size. The sequence 
may extend beyond the ends of the clone and there may be sequence 
contigs within a contig-scaf fold that consist entirely of whole 
genome shotgun sequence reads . Both end sequences and whole genome 
shotgun sequence only contigs will be indicated in the feature 
table . 

— Genome Center 

Center: Baylor College of Medicine 
Center code: BCM 

Web site: http://www.hgsc.bcm.tmc.edu/ 

Contact: hgsc-help@bcm.tmc.edu 
Project Information 

Center project name: KAYT 

Center clone name: CH230-525J22 
Summary Statistics 

Assembly program: Phrap; version 0.990329 

Consensus quality: 219617 bases at least Q40 

Consensus quality: 221191 bases at least Q30 

Consensus quality: 222226 bases at least Q20 

Estimated insert size: 224889; sum-of-contigs estimation 

Quality coverage: 7x in Q20 bases; sum-of-contigs estimation 



* NOTE: Estimated insert size may differ from sequence length 

* (see http://www.hgsc.bcm.tmc.edu/docs/Genbank_draft data.html) 



FEATURES 

source 



misc feature 



misc feature 



NOTE: This is a 'working draft 1 sequence. It currently 
consists of 2 contigs . The true order of the pieces 
is not known and their order in this sequence record is 
arbitrary. Gaps between the contigs are represented as 
runs of N, but the exact sizes of the gaps are unknown. 
This record will be updated with the finished sequence 
as soon as it is available and the accession number will 
be preserved. 

1 237118: contig of 237118 bp in length 
237119 237218: gap of unknown length 
237219 238341: contig of 1123 bp in length. 

Location/Qualif iers 

1. .238341 

/organism= n Rattus norvegicus" 
/mol_t ype= " genomi c DNA" 
/db_xref="taxon: 10116" 
/clone="CH230-525J22" 
1. .878 

/ note="clone_boundary 
clone_end: T7 
site : 

end_s equen'ce :BZ205043" 
24965. .25861 
/ note="clone_boundary 
clone_end: Sp6 
site : 

end_sequence: BZ2 0504 5" 



ORIGIN 



  Query Match 62.9%; Score 2353.8; DB 2; Length 238341;
  Best Local Similarity 99.7%; Pred. No. 0;
  Matches 2358; Conservative 0; Mismatches 7; Indels 0; Gaps 0;
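The percentages in the summary above can be reproduced from the other reported fields. The sketch below assumes, from the numbers alone and not from GenCore documentation, that "Query Match" is the score divided by the query's perfect score (3741 in the run header) and that "Best Local Similarity" is the match count divided by all aligned columns. The function names are hypothetical.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect (self-match) score.

    Assumption: this is how the report derives "Query Match".
    """
    return 100.0 * score / perfect_score


def best_local_similarity_percent(matches: int, mismatches: int, indels: int) -> float:
    """Matches as a percentage of all aligned columns.

    Assumption: aligned columns = matches + mismatches + indels.
    """
    columns = matches + mismatches + indels
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Values copied from the RESULT 11 summary; 3741 is the perfect score
    # given in the run header.
    print(round(query_match_percent(2353.8, 3741), 1))           # 62.9
    print(round(best_local_similarity_percent(2358, 7, 0), 1))   # 99.7
```

The same two relations also reproduce the RESULT 12 figures (score 2343.6; 3017 matches, 574 mismatches, 119 indels), which suggests the assumed definitions are at least consistent with this report.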



Qy        824 CAGAAAAAATTATGGATTTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGG 883
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126178 CAGAAAAAATTATGGATTTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGG 126237

Qy        884 ATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAA 943
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126238 ATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAA 126297

Qy        944 CTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCAGTGTCATCCTCAGAAG 1003
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126298 CTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCAGTGTCATCCTCAGAAG 126357

Qy       1004 GAACAATTGAAGAAACTTTAAATGAAGCTTCTAAAGAGTTGCCAGAGAGGGCAACAAATC 1063
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126358 GAACAATTGAAGAAACTTTAAATGAAGCTTCTAAAGAGTTGCCAGAGAGGGCAACAAATC 126417

Qy       1064 CATTTGTAAATAGAGATTTAGCAGAATTTTCAGAATTAGAATATTCAGAAATGGGATCAT 1123
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126418 CATTTGTAAATAGAGATTTAGCAGAATTTTCAGAATTAGAATATTCAGAAATGGGATCAT 126477



Qy       1124 CTTTTAAAGGCTCCCCAAAAGGAGAGTCAGCCATATTAGTAGAAAACACTAAGGAAGAAG 1183
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126478 CTTTTAAAGGCTCCCCAAAAGGAGAGTCAGCCATATTAGTAGAAAACACTAAGGAAGAAG 126537

Qy       1184 TAATTGTGAGGAGTAAAGACAAAGAGGATTTAGTTTGTAGTGCAGCCCTTCACAGTCCAC 1243
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126538 TAATTGTGAGGAGTAAAGACAAAGAGGATTTAGTTTGTAGTGCAGCCCTTCACAGTCCAC 126597

Qy       1244 AAGAATCACCTGTGGGTAAAGAAGACAGAGTTGTGTCTCCAGAAAAGACAATGGACATTT 1303
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126598 AAGAATCACCTGTGGGTAAAGAAGACAGAGTTGTGTCTCCAGAAAAGACAATGGACATTT 126657

Qy       1304 TTAATGAAATGCAGATGTCAGTAGTAGCACCTGTGAGGGAAGAGTATGCAGACTTTAAGC 1363
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126658 TTAATGAAATGCAGATGTCAGTAGTAGCACCTGTGAGGGAAGAGTATGCAGACTTTAAGC 126717

Qy       1364 CATTTGAACAAGCATGGGAAGTGAAAGATACTTATGAGGGAAGTAGGGATGTGCTGGCTG 1423
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126718 CATTTGAACAAGCATGGGAAGTGAAAGATACTTATGAGGGAAGTAGGGATGTGCTGGCTG 126777

Qy       1424 CTAGAGCTAATGTGGAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAA 1483
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126778 CTAGAGCTAATGTGGAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAA 126837

Qy       1484 AAAGTCTTGGGAAGGATAGTGAAGGCAGAAATGAGGATGCTTCTTTCCCCAGTACCCCAG 1543
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126838 AAAGTCTTGGGAAGGATAGTGAAGGCAGAAATGAGGATGCTTCTTTCCCCAGTACCCCAG 126897

Qy       1544 AACCTGTGAAGGACAGCTCCAGAGCATATATTACCTGTGCTTCCTTTACCTCAGCAACCG 1603
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126898 AACCTGTGAAGGACAGCTCCAGAGCATATATTACCTGTGCTTCCTTTACCTCAGCAACCG 126957

Qy       1604 AAAGCACCACAGCAAACACTTTCCCTTTGTTAGAAGATCATACTTCAGAAAATAAAACAG 1663
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     126958 AAAGCACCACAGCAAACACTTTCCCTTTGTTAGAAGATCATACTTCAGAAAATAAAACAG 127017

Qy       1664 ATGAAAAAAAAATAGAAGAAAGGAAGGCCCAAATTATAACAGAGAAGACTAGCCCCAAAA 1723
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127018 ATGAAAAAAAAATAGAAGAAAGGAAGGCCCAAATTATAACAGAGAAGACTAGCCCCAAAA 127077

Qy       1724 CGTCAAATCCTTTCCTTGTAGCAGTACAGGATTCTGAGGCAGATTATGTTACAACAGATA 1783
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127078 CGTCAAATCCTTTCCTTGTAGCAGTACAGGATTCTGAGGCAGATTATGTTACAACAGATA 127137

Qy       1784 CCTTATCAAAGGTGACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATT 1843
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127138 CCTTATCAAAGGTGACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATT 127197

Qy       1844 TAGTTCAGGAAGCATGTGAAAGTGAACTGAATGAAGCCACAGGTACAAAGATTGCTTATG 1903
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127198 TAGTTCAGGAAGCATGTGAAAGTGAACTGAATGAAGCCACAGGTACAAAGATTGCTTATG 127257

Qy       1904 AAACAAAAGTGGACTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAG 1963
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127258 AAACAAAAGTGGACTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAG 127317

Qy       1964 CACAGCTTTGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATA 2023
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127318 CACAGCTTTGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATA 127377

Qy       2024 TTGTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGC 2083
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127378 TTGTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGC 127437

Qy       2084 CCAGTGTATCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGC 2143
              |||||||||||||||||||||||||||||||| |||||||||||||||||||||||||||
Db     127438 CCAGTGTATCCCCACTGGAAGCACCTCCTCCACTTAGTTATGACAGTATAAAGCTTGAGC 127497

Qy       2144 CTGAAAACCCCCCACCATATGAAGAAGCCATGAATGTAGCACTAAAAGCTTTGGGAACAA 2203
              |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Db     127498 CTGAAAACCCCCCACCATATGAAGAAGCCATGAACGTAGCACTAAAAGCTTTGGGAACAA 127557

Qy       2204 AGGAAGGAATAAAAGAGCCTGAAAGTTTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTT 2263
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127558 AGGAAGGAATAAAAGAGCCTGAAAGTTTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTT 127617

Qy       2264 ATATATCCATTGCGTGTGATTTAATTAAAGAAACAAAGCTCTCCACTGAGCCAAGTCCAG 2323
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127618 ATATATCCATTGCGTGTGATTTAATTAAAGAAACAAAGCTCTCCACTGAGCCAAGTCCAG 127677

Qy       2324 ATTTCTCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGC 2383
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127678 ATTTCTCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGC 127737

Qy       2384 TAGTGGAGGATTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTC 2443
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127738 TAGTGGAGGATTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTC 127797

Qy       2444 CTGAAGTCCCACAAACACAAGAGGAGGCTGTGATGCTCATGAAGGAGAGTCTCACTGAAG 2503
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127798 CTGAAGTCCCACAAACACAAGAGGAGGCTGTGATGCTCATGAAGGAGAGTCTCACTGAAG 127857

Qy       2504 TGTCTGAGACAGTAGCCCAGCACAAAGAGGAGAGACTTAGTGCCTCACCTCAGGAGCTAG 2563
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127858 TGTCTGAGACAGTAGCCCAGCACAAAGAGGAGAGACTTAGTGCCTCACCTCAGGAGCTAG 127917

Qy       2564 GAAAGCCATATTTAGAGTCTTTTCAGCCCAATTTACATAGTACAAAAGATGCTGCATCTA 2623
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127918 GAAAGCCATATTTAGAGTCTTTTCAGCCCAATTTACATAGTACAAAAGATGCTGCATCTA 127977



Qy       2624 ATGACATTCCAACATTGACCAAAAAGGAGAAAATTTCTTTGCAAATGGAAGAGTTTAATA 2683
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     127978 ATGACATTCCAACATTGACCAAAAAGGAGAAAATTTCTTTGCAAATGGAAGAGTTTAATA 128037

Qy       2684 CTGCAATTTATTCAAATGATGACTTACTTTCTTCTAAGGAAGACAAAATAAAAGAAAGTG 2743
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     128038 CTGCAATTTATTCAAATGATGACTTACTTTCTTCTAAGGAAGACAAAATAAAAGAAAGTG 128097

Qy       2744 AAACATTTTCAGATTCATCTCCGATTGAGATAATAGATGAATTTCCCACGTTTGTCAGTG 2803
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     128098 AAACATTTTCAGATTCATCTCCGATTGAGATAATAGATGAATTTCCCACGTTTGTCAGTG 128157

Qy       2804 CTAAAGATGATTCTCCTAAATTAGCCAAGGAGTACACTGATCTAGAAGTATCCGACAAAA 2863
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     128158 CTAAAGATGATTCTCCTAAATTAGCCAAGGAGTACACTGATCTAGAAGTATCCGACAAAA 128217

Qy       2864 GTGAAATTGCTAATATCCAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTG 2923
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     128218 GTGAAATTGCTAATATCCAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTG 128277

Qy       2924 ACCTTTCTTTCAAGAATATATATCCTAAAGATGAAGTACATGTTTCAGATGAATTCTCCG 2983
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     128278 ACCTTTCTTTCAAGAATATATATCCTAAAGATGAAGTACATGTTTCAGATGAATTCTCCG 128337

Qy       2984 AAAATAGGTCCAGTGTATCTAAGGCATCCATATCGCCTTCAAATGTCTCTGCTTTGGAAC 3043
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     128338 AAAATAGGTCCAGTGTATCTAAGGCATCCATATCGCCTTCAAATGTCTCTGCTTTGGAAC 128397

Qy       3044 CTCAGACAGAAATGGGCAGCATAGTTAAATCCAAATCACTTACGAAAGAAGCAGAGAAAA 3103
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     128398 CTCAGACAGAAATGGGCAGCATAGTTAAATCCAAATCACTTACGAAAGAAGCAGAGAAAA 128457

Qy       3104 AACTTCCTTCTGACACAGAGAAAGAGGACAGATCCCTGTCAGCTGTATTGTCAGCAGAGC 3163
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     128458 AACTTCCTTCTGACACAGAGAAAGAGGACAGATCCCTGTCAGCTGTATTGTCAGCAGAGC 128517

Qy       3164 TGAGTAAAACTTCAGTTGTTGACCT 3188
              ||||||||||||||| |  | | ||
Db     128518 TGAGTAAAACTTCAGGTAATAATCT 128542
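The middle line of each alignment block marks the columns where the query (Qy) and database (Db) residues are identical. As a minimal sketch, assuming the convention that `|` marks an identity, a blank marks a mismatch or gap, and `-` is the gap character, the marker line and the column counts can be rebuilt from the two sequence rows. The function names are hypothetical and this is not GenCore's own code.

```python
from typing import Tuple


def match_line(qy: str, db: str) -> str:
    """Return a marker row: '|' where residues are identical, ' ' elsewhere."""
    if len(qy) != len(db):
        raise ValueError("aligned rows must have the same length")
    return "".join(
        "|" if a == b and a != "-" else " "
        for a, b in zip(qy.upper(), db.upper())
    )


def count_columns(qy: str, db: str) -> Tuple[int, int, int]:
    """Count (matches, mismatches, indel columns) in one aligned block."""
    matches = mismatches = indels = 0
    for a, b in zip(qy.upper(), db.upper()):
        if a == "-" or b == "-":
            indels += 1
        elif a == b:
            matches += 1
        else:
            mismatches += 1
    return matches, mismatches, indels


if __name__ == "__main__":
    # Final block of RESULT 11 (Qy 3164-3188 against Db 128518-128542).
    qy = "TGAGTAAAACTTCAGTTGTTGACCT"
    db = "TGAGTAAAACTTCAGGTAATAATCT"
    print(repr(match_line(qy, db)))
    print(count_columns(qy, db))  # (20, 5, 0)
```

Summing these counts over every block of an alignment should give the Matches, Mismatches and Indels totals printed in the result summary, which is a quick way to check a block whose marker line was damaged in extraction.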



RESULT 12 
AX195249 

LOCUS       AX195249   4053 bp    RNA     linear   PAT 28-AUG-2001
ACCESSION   AX195249
VERSION     AX195249.1  GI:15385809
KEYWORDS
SOURCE      Homo sapiens (human)
  ORGANISM  Homo sapiens
            Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
            Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
REFERENCE   1
  AUTHORS   Strittmatter,S.M.
  TITLE     Nogo receptor-mediated blockade of axonal growth
  JOURNAL   Patent: WO 0151520-A 5 19-JUL-2001;
            YALE UNIVERSITY (US)
FEATURES             Location/Qualifiers
     source          1..4053
                     /organism="Homo sapiens"
                     /mol_type="unassigned RNA"
                     /db_xref="taxon:9606"
     CDS             135..3713
                     /note="unnamed protein product; Human mRNA for Nogo
                     protein (KIAA0886, GenBank Accession No. AB020693)"
                     /codon_start=1
                     /protein_id="CAC59983.1"
                     /db_xref="GI:15385810"
                     /db_xref="REMTREMBL:CAC59983"

                     /translation="MEDLDQSPLVSSSDSPPRPQPAFKYQFVREPEDEEEEEEEEEED
                     EDEDLEELEVLERKPAAGLSAAPVPTAPAAGAPLMDFGNDFVPPAPRGPLPAAPPVAP
                     ERQPSWDPSPVSSTVPAPSPLSAAAVSPSKLPEDDEPPARPPPPPPASVSPQAEPVWT
                     PPAPAPAAPPSTPAAPKRRGSSGSVDETLFALPAASEPVIRSSAENMDLKEQPGNTIS
                     AGQEDFPSVLLETAASLPSLSPLSAASFKEHEYLGNLSTVLPTEGTLQENVSEASKEV
                     SEKAKTLLIDRDLTEFSELEYSEMGSSFSVSPKAESAVIVANPREEIIVKNKDEEEKL
                     VSNNILHNQQELPTALTKLVKEDEVVSSEKAKDSFNEKRVAVEAPMREEYADFKPFER
                     VWEVKDSKEDSDMLAAGGKIESNLESKVDKKCFADSLEQTNHEKDSESSNDDTSFPST
                     PEGIKDRSGAYITCAPFNPAATESIATNIFPLLGDPTSENKTDEKKIEEKKAQIVTEK
                     NTSTKTSNPFLVAAQDSETDYVTTDNLTKVTEEVVANMPEGLTPDLVQEACESELNEV
                     TGTKIAYETKMDLVQTSEVMQESLYPAAQLCPSFEESEATPSPVLPDIVMEAPLNSAV
                     PSAGASVIQPSSSPLEASSVNYESIKHEPENPPPYEEAMSVSLKKVSGIKEEIKEPEN
                     INAALQETEAPYISIACDLIKETKLSAEPAPDFSDYSEMAKVEQPVPDHSELVEDSSP
                     DSEPVDLFSDDSIPDVPQKQDETVMLVKESLTETSFESMIEYENKEKLSALPPEGGKP
                     YLESFKLSLDNTKDTLLPDEVSTLSKKEKIPLQMEELSTAVYSNDDLFISKEAQIRET
                     ETFSDSSPIEIIDEFPTLISSKTDSFSKLAREYTDLEVSHKSEIANAPDGAGSLPCTE
                     LPHDLSLKNIQPKVEEKISFSDDFSKNGSATSKVLLLPPDVSALATQAEIESIVKPKV
                     LVKEAEKKLPSDTEKEDRSPSAIFSAELSKTSVVDLLYWRDIKKTGVVFGASLFLLLS
                     LTVFSIVSVTAYIALALLSVTISFRIYKGVIQAIQKSDEGHPFRAYLESEVAISEELV
                     QKYSNSALGHVNCTIKELRRLFLVDDLVDSLKFAVLMWVFTYVGALFNGLTLLILALI
                     SLFSVPVIYERHQAQIDHYLGLANKNVKDAMAKIQAKIPGLKRKAE"

ORIGIN 



  Query Match 62.6%; Score 2343.6; DB 6; Length 4053;
  Best Local Similarity 81.3%; Pred. No. 0;
  Matches 3017; Conservative 0; Mismatches 574; Indels 119; Gaps 21;

13 4 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193 
I II I I I I I I I I MUM | | | | | | | | | | | | | | | | | | | | || | | | || | | | | 

Db 16 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 75 

19 4 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252 
I I I I I I II I I I M I I I II I I I II I I I I II I I I I I I I I I I I I I I I 
Db 76 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 134 

Qy 253 AT GGAAGAC AT AGAC C AGT CGTCGCTGGTCTCCTC GT C C AC GGACAG C C C GC C C C GGC C T 312 

I I I I I I M I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I | | I I I I | | | 
Db 135 ATGGAAGACCTGGACCAGTCTCCTCTGGT— CTCGTCCTCGGACAGCCCACCCCGGCCG 191 

Qy 313 CCGCCCGCCTT CAAGT AC CAGT T C GT GAC GGAGC C C GAGGAC GAGGAG GAC GAGGAGGAG 372 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II II Ml 
Db 192 CAG C C C GC GT T CAAGTAC CAGT T C GT GAG GGAGCC CGAGGAC GAGGAG GAAGAAGAG 248 

Qy 373 GAGGAGGACGAGGAGGAGGAC GAC GAGGAC C TAGAGGAAC T GGAGGT GCT GGAGAGGAAG 4 32 

M M I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 249 GAGGAGGAAGAGGAGGAC GAGGAC GAAGACCT GGAGGAGCT GGAGGT GCT GGAGAGGAAG 308 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 4 86 

I I I N I I I I I I M I I I I I I I I I I I I | I I I I I I I I I III 

Db 3 09 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 368 

Qy 4 87 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 54 6 

M : I I I I I I : I I I I I I I I I I I I I I I II I I I I I I I I I I I I | M I I I I I 
Db 369 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 42 8 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

I I I I I I I I I I I I I I M I I I I I II I I I I I II I I I I I I I I 

Db 42 9 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 488 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I I I I I I I I I I I I I I I M I I I I I I 

Db 489 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 54 8 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

M I I I I I I I I I I M I I I I I II II III III I I I I I I I I Ml Ml 
Db 549 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 608 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I I I I I M II M II Mill I II I M I II I II I I I M I 

Db 609 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 668 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 8 07 

I I I I I I II II II M II I || | | | | M I I I I I II I I M II I II 

Db 669 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 72 8 

Qy 8 08 GT GAT AC CCTCCTCT GC AGAAAAAAT TAT GGAT T T GAT GGAGCAGC C AGGTAACACT GT T 8 67 

I I I I I I I I I I I I M M I I I II I I I I I II I I II II II II II I I I I I II I II I II 
Db 729 GT GAT AC GCT C CT CT GCAGAAAA TAT GGACT T GAAGGAGCAGC CAGGTAACACTATT 785 



Qy        868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927
              ||| ||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db        786 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 845

QY 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 98 7 

N I I I I I I I M I I I I I Mill! | | | | || | | M I I | | | | M | M I II III II 
Db 84 6 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 905 

Qy 98 8 GT GT CAT C C T CAGAAGGAAC AAT TGAAGAAAC TT TAAAT GAAGCT T C TAAAGAGT T GC CA 1047 

II I I II I I I M I I I I I II I I I | | | | | | | | | | | | M I I I II 

Db 906 GTATTACCCACTGAAGGAACACTTCAAGAAAATGTCAGTGAAGCTTCTAAAGAGGTCTCA 965 

Qy 104 8 GAGAGGGCAACAAAT CCATTTGTAAATAGAGATTTAGCAGAATTTT CAGAATT AGAAT AT 1107 

I I I I I I I I I II M I I M I I I I I I II I I I I I I I I I I I I I II I I I II I I I I 

Db 966 GAGT^lGGCAAAAACTCTACTCATAGATAGAGATTTT^ACAGAGTTTTCAGAATTAGAATAC 1025 

Qy 1108 TCAGAAATGGGATCATCTTTTAAAGGCTCCCCAAAAGGAGAGTCAGCCATATTAGTAGAA 1167 

1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 II I I III 1 1 1 1 1 1 1 III II III II 1 1 1 1 II I 

Db 1°26 T C AGAAAT G G GAT CAT C GT T CAGT GT C T CT C CAAAAG CAGAAT CT G C C GTAAT AGT AG CA 1085 

Qy 1168 AAC AC T AAGGAAGAAGT AAT T GT GAGGAGTAAA GACAAAGAGGAT TTAGTT T GTAGT 1224 

II I M I I I I I I I I II I I M I I I I II II I I I M I I II I I I I I | | 

Db 1086 AAT C C TAG G GAAGAAAT AAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGT AAT 1145 

Qy 1225 GC AGC C CT T C ACAGT C C ACAAGAAT CAC C T GT GGGTAAAGAAGAC 1269 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 114 6 AAC AT C CT T CAT AAT CAACAAGAGT T AC C T AC AGCT CT T AC TAAAT T GGT TAAAGAGGAT 12 05 

Qy 1270 AGAGT T GT GT C T C CAGAAAAGACAAT GGAC AT T T T TAAT GAAAT GC AGAT GT C AGTAGT A 1329 

I I I I I I I I I I N Ml II I I I I I I II I I I I I I | I I I I | | 

Db 1206 GAAGT T GT GT C TT CAGAAAAAG C AAAAGAC AGT T T TAAT GAAAAGAGAGT T G CAGT G GAA 1265 

Qy 1330 GC AC C T GT GAG GGAAGAGT AT G C AGACT T TAAGC C AT T T GAACAAGC AT GGGAAGT G AAA 1389 

II HI I I I I I I I M I II I I I I I II I II | | | || | | | | || | | | | | | | | | | | || 
Db 1266 GC T C CT AT GAG GGAGGAAT AT G C AGACT T CAAAC CAT T TGAGC GAGT AT G GGAAGT GAAA 1325 

Qy 1390 GAT ACTT AT GAGGGAAGT AGGGAT GT GCT GGCT GCT AGAGCT AATGTG 1437 

I I I I I I I I I I I I I I I I I I I I I I | | | | | | | || || 

Db 132 6 GATA GTAAGGAAGAT AGT GAT AT GT T GGCT GCT GGAGGTAAAAT C GAGAGCAACTT G 1382 

Qy 1438 GAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAAGTCTTGGGAAG 14 97 

MIM I I I M I I I I I I I I I I I I I I IN | || 

Db 13 83 GAAAGTAAAGT G GATAAAAAAT GT T T T G C AGAT AG C CT T GAG CAAAC TAAT CAC GAAAAA 14 42 

Qy 14 98 GAT AGT GAAG GC AGAAAT GAG GAT GCTTCTTTCCC CAGT AC C C CAGAAC CT GT GAAGGAC 1557 

I I I I I I I I I M Mill III I I I II I I I I I I I I I I I I I I I | | | | | | | | | 
Db 1443 GAT AGT GAGAGT AGTAAT GAT GAT ACT T CT TT C C C CAGT AC G C CAGAAGGT ATAAAGGAT 1502 

Qy 1558 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTA— CCTCAGCAACCGAAAGCACCACA 1614 

III I M M I I I I I I I I I I I I I | | | | | | | | | | | | | | | I I I I I II 

Db 15 03 C GT T C AG GAGC AT AT AT CAC AT GT GCTCCCTT T AAC C CAGCAGCAAC T GAGAGC AT T GC A 1562 

Qy 1615 GCAAAC ACT T T C C CT T T GT T AGAAGAT CAT AC T T C AGAAAAT AAAAC AG AT GAAAAAAAA 1674 

I MM I III | | | | || | || | | | || | | M | | | 

Db 1563 AC AAAC AT TTTTCCTTT GT TAG GAGAT C C T AC T T C AGAAAAT AAGAC C GAT GAAAAAAAA 1622 

Qy 1675 AT AGAAGAAAG GAAG G C C C AAAT T AT AAC AG AGAAG ACT AGC CCCAAAAC GT CAAAT 1731 

M I I M M I I I || | | | | | | || I I I I I M | II I I I I I I I I I I I M I I 



Db 


1623 


ATAGAAGAAAAGAAGGCCCAAATAGTAACAGAGAAGAATACTAGCACCAAAACATCAAAC 1682 


Qy 


1732 


CCTTTCCTT GT AGCAGT AC AG GAT T CT GAG G C AGAT T AT GT T ACAAC AGAT AC C T TAT CA 1791 

Mill II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 I I | | | | | | || | in M 

C CTTT T CTT GTAGCAGCACAGGATT CT GAGACAGAT TAT GT CACAACAGATAATTTAACA 1742 


Db 


1683 


Qy 


1792 


AAGGTGACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATTTAGTTCAG 18 51 
M 1 1 1 1 1 1 1 1 1 1 1 II III 1 1 1 11 I I I I I | | | | | | | | | | | | | | | M | | | | | in 
AAGGT GACT GAGGAAGT C GT G G C AAACAT GC C T GAAG GC C T GAC T C CAGAT T T AGTAC AG 18 02 


Db 


1743 


Qy 


1852 


GAAG CAT GT GAAAGT GAACT GAAT GAAGC C ACAGGT ACAAAGAT T G CT T AT GAAACAAAA 1911 
N 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 I 1 1 1 
GAAG CAT GT GAAAGT GAAT T GAAT GAAGT TACT GGTACAAAGAT T G CT T AT GAAACAAAA 18 62 


Db 


1803 


Qy 


1912 


GT GGACTT GGT CCAAACAT CAGAAGCTATACAAGAAT CACTTTAC C C CACAGCACAGCTT 1971 

1 1 1 1 1 1 M 1 1 II 1 1 1 1 1 II 1 1 1 1 1 M | M | | IMM || || 1 1 | | | | | | M 1 
AT GGACT T GGT T CAAACAT CAGAAGT TAT GCAAGAGT CACT C TAT C C T GC AGC AC AGCTT 1922 


Db 


1863 


Qy 


1972 


T G CC C AT CAT T T GAGGAAGCT GAAGC AACT C CGT C AC CAGT T T T G C C T GAT AT T GT TAT G 2 031 
1 1 1 1 1 1 1 1 1 1 1 I 1 1 || I I I | | | | | | | | | | | | | | | | | | | | | | || | | I | | | | | | | 
T GC C CATC AT T T GAAGAGT CAGAAGCTAC T C CT T C AC C AGT T T T GC CT GAC AT T GT T AT G 1982 


Db 


1923 


Qy 


2032 


GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 
1 M 1 1 1 1 1 1 1 1 MINI I I I | || 1 1 1 M 1 1 1 1 1 1 II 1 II I | | | | | | 
GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 2 042 


Db 


1983 


Qy 


2092 


T C C C CACT GGAAG CAC C T C CT C CAGT T AGT TAT GACAGT ATAAAGC T T GAGC CT GAAAAC 2151 

M 1 1 1 1 MM II M 1 1 M 1 1 1 II II 1 M IMM 1 II 1 1 1 1 1 1 II M 1 

T CAC CAT T AGAAG CTT CTT CAGT TAATTATGAAAGCATAAAACATGAGCCT GAAAAC 2099 


Db 


2043 


Qy 


2152 


C C C C CAC CAT AT GAAGAAGC C AT GAAT GT AG CAC T AAAAGCTTTGGGAACAAAGGAA 2208 

1 M 1 1 II II 1 1 II II II 1 1 1 1 1 II MM 1 1 1 1 Mill | || | | || M 1 II 

C C C C CAC CAT AT GAAGAGGC CAT GAGT GT AT C ACT AAAAAAAGTAT C AG GAATAAAGGAA 2159 


Db 


2100 


Qy 


2209 


G GAAT AAAAGAGCC T GAAAGT T T TAAT G C AGC T GT T C AG GAAAC AGAAGCT C CT TAT AT A 22 68 
1 1 M 1 M M II 1 1 1 II 1 1 1 1 1 II 1 II 1 1 1 MM 1 II II II II 1 1 II 1 1 1 M 1 1 1 
GAAAT T AAAGAGC CTGAAAAT AT TAAT GC AGCT CTT CAAGAAAC AGAAGCT C CT T AT AT A 2219 


Db 


2160 


Qy 


2269 


TCCATTGCGTGTGATTTAATTAAAGAAACAAAGCTCTCCACTGAGCCAAGTCCAGATTTC 232 8 
M M M 1 1 1 1 1 M 1 1 1 1 M II II 1 II 1 1 1 II II II II II III III II 1 1 M 
T C TAT T GC AT GT GATT TAAT TAAAGAAACAAAGC T T T C T GCT GAACC AG C T C C GGATT T C 227 9 


Db 


2220 


Qy 


2329 


TCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGCTAGTG 238 8 

1 1 1 M M 1 M II II II 1 II II 1 1 II II 1 Mill M II 1 II II II 1 1 1 

T CT GAT TAT T C AGAAAT GGCAAAAGT T GAAC AGC CAGT GC CT GAT C ATT CT GAG C TAGTT 2339 


Db 


2280 


Qy 

Db 


2389 
2340 


GAGGAT T C CT CAC CT GAATCT GAAC CAGT T GACT TAT T T AGT GAT GATT C GATT C C T GAA 244 8 

M 1 M II 1 II II II II 1 1 1 1 1 1 1 1 1 1 1 II 1 1 | | | || || || | | || || M II 1 1 II 1 

GAAGAT T C CT CAC C T GAT T CT GAAC C AGT T GAC T TAT T T AGT GAT GATT CAAT AC CT GAC 2399 


Qy 


2449 


GT C C C AC AAAC ACAAGAGGAGG CT GT GAT GCT CAT GAAGGAGAGT C T CACT GA A 2 5 02 
Db 2400 GTTCCACAAAAACAAGATGAAACTGTGATGCTTGTGAAAGAAAGTCTCACTGAGACTTCA 2459

Qy 2503 GTGTCTGAGACAGTAGCCCAGCACAAAGAGGAGAGACTTAGTGCCTCACCTCAGGAGCTA 2562
Db 2460 TTTGAGTCAATGATAGAATATGAAAATAAGGAAAAACTCAGTGCTTTGCCACCTGAGGGA 2519

Qy 2563 GGAAAGCCATATTTAGAGTCTTTTCAGCCCAATTTACATAGTACAAAAGATGC---TGCA 2619
Db 2520 GGAAAGCCATATTTGGAATCTTTTAAGCTCAGTTTAGATAACACAAAAGATACCCTGTTA 2579

Qy 2620 TCTAATGACATTCCAACATTGACCAAAAAGGAGAAAATTTCTTTGCAAATGGAAGAGTTT 2679
Db 2580 CCTGATGAAGTTTCAACATTGAGCAAAAAGGAGAAAATTCCTTTGCAGATGGAGGAGCTC 2639

Qy 2680 AATACTGCAATTTATTCAAATGATGACTTACTTTCTTCTAAGGAAGACAAAATAAAAGAA 2739
Db 2640 AGTACTGCAGTTTATTCAAATGATGACTTATTTATTTCTAAGGAAGCACAGATAAGAGAA 2699

Qy 2740 AGTGAAACATTTTCAGATTCATCTCCGATTGAGATAATAGATGAATTTCCCACGTTTGTC 2799
Db 2700 ACTGAAACGTTTTCAGATTCATCTCCAATTGAAATTATAGATGAGTTCCCTACATTGATC 2759

Qy 2800 AGTGCTAAAGATGATTC---TCCTAAATTAGCCAAGGAGTACACTGATCTAGAAGTATCC 2856
Db 2760 AGTTCTAAAACTGATTCATTTTCTAAATTAGCCAGGGAATATACTGACCTAGAAGTATCC 2819

Qy 2857 GACAAAAGTGAAATTGCTAATATCCAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTG 2916
Db 2820 CACAAAAGTGAAATTGCTAATGCCCCGGATGGAGCTGGGTCATTGCCTTGCACAGAATTG 2879

Qy 2917 CCCTGTGACCTTTCTTTCAAGAATATATATCCTAAAGATGAAG------TACATGTTTCA 2970
Db 2880

Qy 2971 GATGAATTCTCCGAAAATAGGTCCAGTGTATCTAAGGCATCCATATCGCCTTCAAATGTC 3030
Db 2940 GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 2999

Qy 3031 TCTGCTTTGGAACCTCAGACAGAAATGGGCAGCATAGTTAAATCCAAATCACTTACGAAA 3090
Db 3000 TCTGCTTTGGCCACTCAAGCAGAGATAGAGAGCATAGTTAAACCCAAAGTTCTTGTGAAA 3059

Qy 3091 GAAGCAGAGAAAAAACTTCCTTCTGACACAGAGAAAGAGGACAGATCCCTGTCAGCTGTA 3150
Db 3060 GAAGCTGAGAAAAAACTTCCTTCCGATACAGAAAAAGAGGACAGATCACCATCTGCTATA 3119

Qy 3151 TTGTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTCTACTGGAGAGACATTAAG 3210
Db 3120 TTTTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTGTACTGGAGAGACATTAAG 3179

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270
Db 3180 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3239

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330
Db 3240 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3299

Qy 3331
Db 3300

Qy 3391 TATTTAGAATCTGAAGTTGCTATATCAGAGGAATTGGTTCAGAAATACAGTAATTCTGCT 3450
Db 3360 TATCTGGAATCTGAAGTTGCTATATCTGAGGAGTTGGTTCAGAAGTACAGTAATTCTGCT 3419

Qy 3451 CTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTA 3510
Db 3420 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3479

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570
Db 3480 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3539

Qy 3571 AATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTAT 3630
Db 3540 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3599

Qy 3631 GAACGGCATCAGGTGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGAT 3690
Db 3600 GAACGGCATCAGGCACAGATAGATCATTATCTAGGACTTGCAAATAAGAATGTTAAAGAT 3659

Qy 3691 GCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740
Db 3660 GCTATGGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3709
AB020693 4053 bp mRNA linear PRI 16-JUN-1999

Homo sapiens mRNA for KIAA0886 protein, complete cds.

AB020693 

AB020693.1 GI:4240260

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

1 (sites) 

Nagase,T., Ishikawa,K., Suyama,M., Kikuno,R., Hirosawa,M., 
Miyajima,N., Tanaka,A., Kotani,H., Nomura,N. and Ohara,O.

Prediction of the coding sequences of unidentified human genes. 

XII. The complete sequences of 100 new cDNA clones from brain which 

code for large proteins in vitro 

DNA Res. 5 (6), 355-364 (1998) 

99156230 

10048485 

2 (bases 1 to 4053) 

Ohara,O., Suyama,M., Kikuno,R., Nagase,T. and Ishikawa,K.
Direct Submission 

Submitted (02-DEC-1998) Osamu Ohara, Kazusa DNA Research Institute,
Laboratory of DNA Technology; Yana 1532-3, Kisarazu, Chiba
292-0812, Japan (E-mail:cdnainfo@kazusa.or.jp, Tel:+81-438-52-3913,
Fax:+81-438-52-3914)

Location/Qualifiers 

1..4053

/organism="Homo sapiens"

/mol_type="mRNA"

/db_xref="taxon:9606"

/clone="hk07722"

/sex="male"

/tissue_type="brain"

/clone_lib="pBluescriptII SK plus"

/dev_stage="adult"
gene 1..4053

/gene="KIAA0886"
CDS 135..3713

/gene="KIAA0886"

/codon_start=1

/product="KIAA0886 protein"

/protein_id="BAA74909.1"

/db_xref="GI:4240261"

/translation="MEDLDQSPLVSSSDSPPRPQPAFKYQFVREPEDEEEEEEEEEED
EDEDLEELEVLERKPAAGLSAAPVPTAPAAGAPLMDFGNDFVPPAPRGPLPAAPPVAP
ERQPSWDPSPVSSTVPAPSPLSAAAVSPSKLPEDDEPPARPPPPPPASVSPQAEPVWT
PPAPAPAAPPSTPAAPKRRGSSGSVDETLFALPAASEPVIRSSAENMDLKEQPGNTIS
AGQEDFPSVLLETAASLPSLSPLSAASFKEHEYLGNLSTVLPTEGTLQENVSEASKEV
SEKAKTLLIDRDLTEFSELEYSEMGSSFSVSPKAESAVIVANPREEIIVKNKDEEEKL
VSNNILHNQQELPTALTKLVKEDEVVSSEKAKDSFNEKRVAVEAPMREEYADFKPFER
VWEVKDSKEDSDMLAAGGKIESNLESKVDKKCFADSLEQTNHEKDSESSNDDTSFPST
PEGIKDRSGAYITCAPFNPAATESIATNIFPLLGDPTSENKTDEKKIEEKKAQIVTEK
NTSTKTSNPFLVAAQDSETDYVTTDNLTKVTEEVVANMPEGLTPDLVQEACESELNEV
TGTKIAYETKMDLVQTSEVMQESLYPAAQLCPSFEESEATPSPVLPDIVMEAPLNSAV
PSAGASVIQPSSSPLEASSVNYESIKHEPENPPPYEEAMSVSLKKVSGIKEEIKEPEN
INAALQETEAPYISIACDLIKETKLSAEPAPDFSDYSEMAKVEQPVPDHSELVEDSSP
DSEPVDLFSDDSIPDVPQKQDETVMLVKESLTETSFESMIEYENKEKLSALPPEGGKP
YLESFKLSLDNTKDTLLPDEVSTLSKKEKIPLQMEELSTAVYSNDDLFISKEAQIRET
ETFSDSSPIEIIDEFPTLISSKTDSFSKLAREYTDLEVSHKSEIANAPDGAGSLPCTE
LPHDLSLKNIQPKVEEKISFSDDFSKNGSATSKVLLLPPDVSALATQAEIESIVKPKV
LVKEAEKKLPSDTEKEDRSPSAIFSAELSKTSVVDLLYWRDIKKTGVVFGASLFLLLS
LTVFSIVSVTAYIALALLSVTISFRIYKGVIQAIQKSDEGHPFRAYLESEVAISEELV
QKYSNSALGHVNCTIKELRRLFLVDDLVDSLKFAVLMWVFTYVGALFNGLTLLILALI
SLFSVPVIYERHQAQIDHYLGLANKNVKDAMAKIQAKIPGLKRKAE"
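
As a consistency check on the CDS annotation, the first codons of the database sequence can be translated and compared with the start of the /translation qualifier. The sketch below is illustrative only: it uses the standard genetic code, takes the first 30 bases of the Db row that begins at position 135 in the alignment for this entry (with extraction spaces removed), and uses hypothetical helper names (`translate`, `CODON_TABLE`) that are not part of the report.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, ignoring spaces and any
    trailing partial codon."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[p:p + 3]] for p in range(0, usable, 3))


# First 30 bases of the Db sequence starting at the annotated CDS start (135).
cds_start = "ATGGAAGACCTGGACCAGTCTCCTCTGGTC"
protein_start = translate(cds_start)
print(protein_start)  # MEDLDQSPLV

# Length implied by the CDS location 135..3713 (inclusive coordinates).
cds_length = 3713 - 135 + 1          # 3579 bases
codons = cds_length // 3             # 1193 codons, the last being the stop
residues = codons - 1                # 1192 amino acids
```

The translated prefix agrees with the first ten residues of the qualifier, and the CDS span implies 1192 residues, which is the total length of the translation lines above.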

ORIGIN 

Query Match 62.6%; Score 2343.6; DB 9; Length 4053; 

Best Local Similarity 81.3%; Pred. No. 0; 

Matches 3017; Conservative 0; Mismatches 574; Indels 119; Gaps 21; 
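
The summary figures above can be tied together with simple arithmetic. The sketch below assumes the perfect score of 3741 reported in the search header, the header's gap penalties (Gapop 10.0, Gapext 1.0), and a mismatch penalty of 0.6 for the IDENTITY_NUC table. That last value is not stated anywhere in the report; it is an inference, chosen because it makes the reported score reconcile. All variable names are illustrative.

```python
# Values copied from the report.
PERFECT_SCORE = 3741.0          # search header: perfect score for the query
score = 2343.6
matches, mismatches, indels, gaps = 3017, 574, 119, 21

# "Query Match" is the score as a percentage of the perfect score.
query_match = 100 * score / PERFECT_SCORE

# "Best Local Similarity" is matches over all aligned columns.
columns = matches + mismatches + indels
best_local_similarity = 100 * matches / columns

# Rebuilding the score from the column counts.
GAP_OPEN, GAP_EXT = 10.0, 1.0   # search header: Gapop 10.0, Gapext 1.0
MISMATCH = 0.6                  # ASSUMPTION: inferred, not stated in the report
rebuilt = (matches
           - MISMATCH * mismatches
           - GAP_OPEN * gaps     # one opening charge per gap run
           - GAP_EXT * indels)   # one extension charge per gapped column

print(round(query_match, 1))            # 62.6
print(round(best_local_similarity, 1))  # 81.3
print(round(rebuilt, 1))                # 2343.6
```

Under these assumptions each gap run costs the opening penalty plus one extension penalty per gapped column; a different convention would require a different inferred mismatch value.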

Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193 

I M I M I I Ml I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I 
Db 16 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 75

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252 

I I I I I I M I I M I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 76 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 134 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II II I I I I I I II 
Db 135 ATGGAAGACCTGGACCAGTCTCCTCTGGT---CTCGTCCTCGGACAGCCCACCCCGGCCG 191

Qy 313 CCGCCCGCCTT CAAGT AC CAGT T C GT GAC G GAGC C C GAG GAC GAGGAGGAC GAGGAGGAG 372 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II III 
Db 192 C AG C C C G C GT T CAAGT AC CAGT T C GT GAG GGAGC C C GAGGAC GAG GAG GAAGAAGAG 24 8 



Qy 373 GAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTGCTGGAGAGGAAG 432

Db 249 GAGGAGGAAGAGGAGGACGAGGACGAAGACCTGGAGGAGCTGGAGGTGCTGGAGAGGAAG 308

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

I I I I I I I I I M I I I I I I I I I I I I I I I I II I I I I I I I I I M II I I I 

Db 309 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 368 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546 

I I I I I I I I I I I I I I I I I II II I I II I I I I I II I I I II I I I I I I III 
Db 369 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 42 8 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 429 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 4 88 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I II I I I I I II I I I I I I I I I I M 
Db 489 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 54 8 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I I I I I II I I I I II I I I II II II III III I I I I I I I I III III 
Db 549 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 608 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 
Db 609 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 668 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 8 07 

I I I I I I I I I I M I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I || I I I I I I | 
Db 669 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 72 8 

Qy 8 08 GT GAT AC CCTCCTCT GCAGAAAAAAT TAT GGATT T GAT GGAG CAGC C AGGTAACAC T GT T 8 67 

I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I II I II I I II 

Db 729 GTGATACGCTCCTCTGCAGAAAA TAT G GACT T GAAG GAGCAG C C AGGT AAC ACT AT T 7 85 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

Ml I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I II 
Db 786 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 845 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I II I I II I II III II 

Db 846 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 905 

Qy 98 8 GT GT CAT C CT C AGAAGGAACAATT GAAGAAACT T T AAAT GAAGCT T C TAAAGAGT T GC C A 104 7 

II I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 906 GT AT T AC C C ACT GAAG GAAC AC TT CAAGAAAAT GT CAGT GAAGCT T CT AAAGAGGT CT C A 9 65 

Qy 1048 GAGAGGGCAACAAAT C CAT T T GT AAAT AG AGAT T TAG C AGAAT T T T C AGAATT AGAAT AT 1107 

I I I I I I I I I II II I I II I II I I I I I II I I I I I I I I I I I I I I II I I I I I I 
Db 966 GAGAAG GCAAAAACT CT ACT CAT AGAT AGAGAT T T AAC AGAGT T T T C AGAATT AGAAT AC 1025 

Qy 1108 T C AGAAAT GGGAT CAT CT T T T AAAG GC T C C C CAAAAGGAGAGT C AG C CAT AT T AGTAGAA 1167 

I I I I I I I M I I I I I I I I 11 I I III I I I I I I I III II Ml II I I I I I I I 
Db 102 6 T C AGAAAT GGGAT CAT C GT T CAGT GT C T CT C CAAAAGC AGAAT CT G C C GTAAT AGT AGC A 108 5 

Qy 1168 AACACTAAGGAAGAAGTAATT GT GAGGAGTAAA G ACAAAGAGGAT TT AGT T T GT AGT 1224 

M I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I II I I M I I 



Db 10 86 AATCCTAGGGAAGAAATAAT CGT GAAAAATAAAGAT GAAGAAGAGAAGTTAGTTAGTAAT 114 5 

Qy 1225 G CAG C C C T T C AC AGT C C ACAAGAAT C AC CT GTGGGTAAAGAAGAC 1269 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 114 6 AACAT C CT T CATAAT CAACAAGAGTT ACCT ACAGCT CTTACTAAATT GGTTAAAGAGGAT 12 05 

Qy 127 0 AGAGT T GT GT C T C C AGAAAAGACAAT GGAC AT T T T TAAT GAAAT G CAGAT GT C AGT AGT A 1329 

I I I II I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I 

Db 1206 GAAGT T GT GT CTT CAGAAAAAG CAAAAGAC AGT T T TAAT GAAAAGAGAGT T G CAGT GGAA 1265 

Qy 1330 GC AC C T GT GAGGGAAGAGT AT GCAGACT T TAAG C CAT T T GAACAAGC AT GGGAAGT GAAA 1389 

II III I II I II I II I I I I I I I I I I I II I I I I I I I I I II I II I II I I I I I I I 
Db 12 66 GCT C CT AT GAG GGAG GAAT AT GCAGAC T T CAAAC CAT T T GAG C GAG TAT GGGAAGT GAAA 1325 

Qy 1390 GAT ACT TAT GAGG GAAGT AG GGAT GTGCTGGCT GCT AGAG CT AATGTG 1437 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I II II 

Db 1326 GATA GT AAGGAAGAT AGT GAT AT GT TGGCTGCT GGAGGT AAAAT CGAGAG CAACT T G 1382 

Qy 1438 GAAAGTAAAGT GG AC AGAAAAT GCT T G GAAGAT AG C C T G GAGCAAAAAAGT CTT G GGAAG 1497 

I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I II I I III I II 
Db 1383 GAAAGTAAAGT GGAT AAAAAAT GT T T T GC AGAT AGC CT T GAGCAAACT AAT C AC GAAAAA 1442 

Qy 14 98 GAT AGT GAAGGCAGAAAT GAGGAT GCTT CTTT C CCCAGTACCC CAGAACCT GT GAAGGAC 1557 

I I I I I I I I I II I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1443 GAT AGT GAGAGT AGTAAT GAT GAT ACT T CTT T C C C CAGT AC G C C AGAAGGT AT AAAGG AT 1502 

Qy 1558 AG CT C CAGAG CAT AT AT T AC CT GT GCT T C CT T T A CCT CAGCAACCGAAAGCAC CACA 1614 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1503 C GT T CAG GAGC AT AT AT C AC AT GTGCTCCCTT T AAC C C AGCAG CAACT GAGAGC AT T GCA 1562 

Qy 1615 GCAAACACT T T CC CTTT GTTAGAAGAT CAT ACTT CAGAAAATAAAACAGAT GAAAAAAAA 1674 

I I I I I I III I I I I I I I II I I I II I I I I II I I I I II I I I I II I I I I I I I I I I I I 
Db 1563 AC AAACAT TTTTCCTTT GTT AG GAGAT C CT ACT T CAGAAAATAAGAC CGAT GAAAAAAAA 1622 

Qy 1675 AT AG AAGAAAGG AAG G C C C AAAT T AT AAC AGAGAAG AC TAG C C C C AAAAC GT C AAAT 1731 

I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I 
Db 1623 AT AGAAGAAAAG AAG G C C C AAAT AGT AAC AGAGAAGAAT ACT AGC AC C AAAAC AT CAAAC 1682 

Qy 17 32 CCTTTCCTT GT AG C AGT ACAGGAT T C T GAGGC AGAT TAT GTT ACAACAGAT AC CT T AT C A 17 91 

I ! I ! I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III II 
Db 1683 CCT TTT CTT GTAGCAGCACAGGATT CT GAGACAGATTATGT CACAACAGATAATTTAACA 1742 

Qy 1792 AAGGT GACT GAGG C AGC AGT GT CAAAC AT GC CT GAAG GT C T GAC GC CAGAT T T AGT TC AG 1851 

I I I I I I I I I I I I I II Ml I II I I I I I I II II I II Mill I I II I I II I II III 
Db 1743 AAGGT GACT GAGGAAGT CGT GG CAAAC AT GC CT GAAG GC C T GAC T C C AGATT T AGTAC AG 1802 

Qy 18 52 GAAGCAT GT GAAAGT GAAC T GAAT GAAGC C AC AG GT ACAAAGATT G CT T AT GAAACAAAA 1911 

I II I I I I M I I I I II I I I II II I I II I II I I I I M I I I I I I I II I I I I II I I II II 
Db 1803 GAAGCAT GT GAAAGT GAAT T GAAT GAAGT TACT G GT AC AAAGAT T GCT TAT GAAACAAAA 1862 

Qy 1912 GT GGACT T G GT C CAAAC AT CAGAAGCT AT ACAAGAAT C ACT T T AC C C C ACAG C AC AGCT T 1971 

I II I II II I I I I II I II I I II II I II Mill I I I I I II M I I I I I I II I I I 
Db 18 63 AT G GAC T T G GT T CAAAC AT CAGAAGT T AT GC AAGAGT C AC T CT AT C CT G CAG CAC AGCT T 1922 

Qy 1972 T G C C CAT CAT T T GAG GAAGCT GAAGCAACT C C GT CAC C AGT T T T GC C T GAT AT T GT T AT G 2031 

I I I I II II M I I II II I II I I I Mill II I II I I I I II I I II I I I I II I I I I I 
Db 1923 T G C C CAT CAT T T GAAGAGT CAGAAG CT ACT C CTT CAC C AGT T T T GC C T GAC AT T GT T AT G 19 82 



Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

I I I I I I I I I I I I I I I I I II I I II I I I I I I I M II I I I I I I I I II I I 

Db 1983 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 2 042 

Qy 2 092 T C C C CACT GGAAG C AC C T C C T C CAGT T AGT TAT GACAGT AT AAAGCTT GAG C CT GAAAAC 2151 

M 1 I I I I I I I II II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 2043 T C AC CAT T AGAAG CT T CTT CAGT TAAT TAT GAAAGCATAAAACAT GAGC C T GAAAAC 2099 

Qy 2152 C C C C C AC CAT AT GAAGAAGC CAT GAAT GT AG CACT AAAAGCTTTGGGAACAAAGGAA 2208 

I I I M I I I I I I II I I I I I I II I I I I I I I I I I I I I I II I I I I I I II I I M 
Db 2100 C C C C CAC CAT AT GAAGAG G C CAT GAGT GT AT C AC T AAAAAAAGT AT CAG GAATAAAGGAA 2159 

Qy 2209 GGAATAAAAGAGC C T GAAAGTT T TAAT GC AG CT GTT CAGGAAAC AGAAG C T C C TT AT AT A 22 68 

I III I II I I I I I I II I I I I I I I I II I I I I MM I I I I I II I I II I I I I I I I I II 

Db 2160 GAAATTAAAGAGCCT GAAAATATTAAT GCAGCT CTT CAAGAAACAGAAGCT C CTTATAT A 2219 

Qy 2269 T CCAT T G C GT GT GAT T TAAT T AAAGAAACAAAGC T CT C CAC T GAGC CAAGT C CAGAT T T C 2328 

II I I I I I I I I I I I I I I II II I I M I II II I I II M I I I I III III MINI 

Db 2220 T CT ATT G CAT GT GAT T TAAT TAAAGAAACAAAG CTT T C T GCT GAAC C AGCT C C GGAT T T C 2279 

Qy 2329 T CTAATT ATT CAGAAATAGCAAAAT T C GAGAAGT CGGT GC C CGAACACGCT GAGCTAGT G 2 38 8 

I II I I II I M I I II I I I II I I I I I I I I I I I I II I I I I I II I I I II I I 

Db 22 80 T CT GAT T ATT C AGAAAT GGCAAAAGT T GAAC AGC CAGT GC CT GAT CATT CT GAGCTAGT T 2339 

Qy 238 9 GAG GAT T C C T CAC C T GAAT C T GAAC CAGT T G AC T TAT T T AGT GAT GAT T C GAT T C C T GAA 244 8 

II I M II II I II II I I II I I I I I I II I II II I II II II I I I I I I II II II Mill 

Db 2340 GAAGAT T C CT C AC CT GAT T CT GAAC CAGT T GACT T ATT T AGT GAT GATT CAAT AC CT GAC 2399 

Qy 2 44 9 GT C C CACAAAC ACAAGAG GAG GC T GT GAT GCT CAT GAAGGAGAGT CT CAC T GA A 25 02 

I I I I II I I I I I I I I I I I I I II I II I II I I I I II I I II I I II II I I 
Db 2 400 GT T C CACAAAAACAAGAT GAAACT GT GAT GC T T GT GAAAGAAAGT CT CACT GAG ACTT C A 2459 

Qy 2503 GT GT C T GAGAC AGT AGC C C AGC ACAAAGAGGAGAGACT TAGT G C CT C AC CT C AGGAGC T A 2562 

I I III I III I II I I I I I II II I I III III I 

Db 2460 T T T GAGT CAAT GAT AGAAT AT GAAAAT AAGGAAAAACT CAGT GCT T T GCCAC CT GAGGGA 2519 

Qy 2563 G GAAAG C CAT AT T T AGAGT CT T T T CAG C C CAAT T T AC AT AGT ACAAAAGAT G C TGCA 2619 

I I I I II II I M II I I I I M I I I I II II I I M I II II II M II I I I 
Db 2520 GGAAAGC C AT AT T T GGAAT C T T TT AAGCT CAGTT TAG AT AAC ACAAAAGAT AC C CT GT T A 2579 

Qy 2 62 0 T CTAAT GACAT T C CAAC AT T GAC CAAAAAGGAGAAAAT T T CT T T GCAAAT G GAAGAGT T T 2 67 9 

II II II II II I II II II I I II II II II I I II II I II II II Mill III I 

Db 2580 C CT GAT GAAGT T T CAACAT T GAGCAAAAAGGAGAAAAT T C CT T T G CAGAT GGAG GAGCT C 2639 

Qy 2 680 AAT ACT GCAAT T TAT T CAAAT GAT GACT TACT T T CTT CT AAGGAAGACAAAAT AAAAGAA 2739 

I I M I II I I II II I I I II I II I I II II I II I I I II I II II I I I II I I I II 
Db 264 0 AGT ACT G C AGT T TAT T CAAAT GAT GAC T TAT T T AT TT C T AAGGAAGC AC AGAT AAGAGAA 2 699 

Qy 2740 AGT GAAAC AT T T T CAGAT T CAT C T C C GAT T GAG AT AAT AG AT GAAT T T C C CAC G T T T G T C 2799 

I II II I I II I I II I II I I II I I I I Mill M I I I II II I II II II II II 
Db 2700 ACT GAAAC GT TTT CAGATT CAT CT CCAATT GAAATTAT AGAT GAGTT CCCT ACATTGAT C 2759 

Qy 2 80 0 AGT GC T AAAGAT GAT T C T C CT AAAT TAG C CAAGGAGT AC ACT GAT CT AGAAGT AT C C 2 856 

III II I I I I I I I I I I II II I II I I I II I II II I II II I I II I I I II I II 

Db 2760 AGT T C T AAAACT GAT T CAT TTT CT AAAT TAG C CAGGGAAT AT ACT GAC CT AGAAGT AT C C 2819 



Qy 2857 GACAAAAGT GAAAT T G CTAAT AT C CAAAG C GG GGCAGAT T CAT T GC C T T GCT T AGAAT T G 2916 

I I II I I I I I I I II I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I 

Db 2820 C ACAAAAGT GAAAT T GC T AAT GC C C C G GAT G GAG CT GGGT CAT T G C CT T GC AC AGAAT T G 2879 

Qy 2917 C C C T GT GAC CTTTCTTT C AAG AAT AT AT AT C C T AAAGAT GAAG TACATGTTTCA 2970 

III I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I I 

Db 28 8 0 C C C CAT GAC CT T T C T T T GAAGAAC AT ACAAC C CAAAGT T GAAGAGAAAAT C AGT TT CT C A 293 9 

Qy 2971 GAT GAAT T CT C C GAAAAT AG GT C C AGT GT AT CT AAG G CAT C CAT AT C GC CT T CAAAT GT C 3030 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I | I I 

Db 294 0 GAT GACTTTTCTAAAAAT GGGT CTGCTACATCAAAGGT GCT CTT ATT GCCTCCAGATGTT 2999 

Qy 3031 TCTGCTTT GGAAC CT CAGAC AGAAAT G GGC AG CAT AGT T AAAT CCAAAT C AC TT AC GAAA 3090 

I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III MM 
Db 3000 TCTGCTTT G GC CAC T CAAGC AGAGAT AGAGAGCATAGT T AAAC C CAAAGT T CTT GT GAAA 305 9 

Qy 3091 GAAG C AGAGAAAAAACT T C CT T CT GAC AC AGAGAAAG AGGAC AG AT C C CT GT C AGCT GT A 3150 

I I I I I I I II I II I I I I I I I I I I II Mill II I I I I I I I II I I I I II III M 

Db 3060 GAAGC T GAGAAAAAAC TTCCTTCC GAT AC AGAAAAAGAGGAC AGAT CAC CAT CT G CT AT A 3119 

Qy 3151 TT GT C AG C AGAGCT GAGTAAAACT T C AGT T GT T GAC CT C CT CT AC T G GAGAGAC AT T AAG 3210 

II II II II I I I I I I II M II I II M I I I I M II I I I II II II I I I II I I I II I I II I I 

Db 312 0 T T T T CAGC AGAGCT GAGT AAAAC T T C AGT T GT T GAC CT C CT GT ACT GGAGAGAC AT T AAG 3179 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270 

I I II I I I I II I I I I I I I II I I I I I I I I I II I I I I I I I I M II I II I I II MINI 

Db 3180 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3239 

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

II I I I II I I I I I I I I I I I I I I I M I II I I I I I II II II I II II I II I I I I I I I M 

Db 324 0 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 32 9 9 

Qy 3331 AT AT AT AAGGGC GT GAT C C AGGC T AT C C AGAAAT CAGAT GAAGGC CAC C CAT T C AG GG CA 3390 

Mill I I II I I II I II I I I I I I I II I I I I M I I I II I II II II I II II M II II I I I 
Db 33 00 AT AT ACAAGGGT GT GAT C CAAGCT AT C CAGAAAT CAGAT GAAGGC CACCCATT CAGGGCA 3359 

Qy 3391 TAT T T AGAAT C T GAAGT T GCT AT AT C AGAGGAAT T GGT T CAGAAAT AC AGT AAT T CT G CT 3450 

III I I I I II I I I I I I I II I I II II Mill II II I I I I I II I I I I I I I I II I II I I 

Db 3360 TAT CT GGAATCT GAAGT T GCT AT AT CT GAGGAGTT GGTT CAGAAGT ACAGTAATTCT GCT 3419 

Qy 3451 CT T G GT CAT GT GAACAG CACAAT AAAAGAACT GAGG C G GCTT T T CT T AGT T GAT GAT T T A 3510 

I I I II II II I I I I II MM I I I I I I I I II II I II II I I II I II I II II I I I I I I 
Db 342 0 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3479 

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570 

I I I I I II I II I II I I II I I II II I I I I II I I II I I I II I I II I I I I II II I II M I 

Db 34 8 0 GTTGATTCTCTGAAGTTTGCAGTGTT GAT GTGGGTATTTACCTATGTT GGT GC CTT GTTT 3539 

Qy 3571 AAT G GT CT GAC AC T AC T GAT T T T AG CT CT GAT C T CAC T CT T C AGT AT T C CT GT TAT T TAT 3630 

II I I II I II II I I II I I II I I II Mill II I I II I I I II I I I II I II I II II I I I I 

Db 3540 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3599 

Qy 3631 GAAC G G CAT C AG GT GCAGAT AGAT CAT TAT C T AGGACT T GCAAACAAGAGT GT T AAG GAT 3690 

II I I M I II II II || || II II II I I II II II I I I II II M I I I I II I II I I I III 
Db 3600 GAAC G G CAT C AGGC ACAGAT AGAT CAT TAT C TAG GACT T GCAAATAAGAAT GT T AAAGAT 3659 

Qy 3691 GCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740

        || ||||| |||||||||||||||||||||||||||||||||||||| ||
Db 3660 GCTATGGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3709
AF148537 4632 bp mRNA linear PRI 09-SEP-2000

Homo sapiens reticulon 4a mRNA, complete cds.

AF148537 

AF148537.1 GI:10039550

Homo sapiens (human) 
Homo sapiens 

Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

1 (bases 1 to 4632) 

Yang,J., Yu,L., Bi,A.D. and Zhao,S.Y.

Assignment of the human reticulon 4 gene (RTN4) to chromosome 

2p14-->2p13 by radiation hybrid mapping

Cytogenet. Cell Genet. 88 (1-2), 101-102 (2000) 

20237542 

10773680 

2 (bases 1 to 4632) 
Zhou,Y., Yu,L. and Zhao,S.Y. 
Direct Submission 

Submitted (05-MAY-1999) Lab of Human Gene Research, Institute of
Genetics, Fudan University, No. 220 Handan Rd., Shanghai 200433, 
P.R.China 

Location/Qualifiers 
1..4632

/organism="Homo sapiens"
/mol_type="mRNA"
/db_xref="taxon:9606"
142..3720
/note="RTN4a"
/codon_start=1
/product="reticulon 4a"
/protein_id="AAG12176.1"
/db_xref="GI:10039551"

/translation="MEDLDQSPLVSSSDSPPRPQPAFKYQFVREPEDEEEEEEEEEED
EDEDLEELEVLERKPAAGLSAAPVPTAPAAGAPLMDFGNDFVPPAPRGPLPAAPPVAP
ERQPSWDPSPVSSTVPAPSPLSAAAVSPSKLPEDDEPPARPPPPPPASVSPQAEPVWT
PPAPAPAAPPSTPAAPKRRGSSGSVDETLFALPAASEPVIRSSAENMDLKEQPGNTIS
AGQEDFPSVLLETAASLPSLSPLSAASFKEHEYLGNLSTVLPTEGTLQENVSEASKEV
SEKAKTLLIDRDLTEFSELEYSEMGSSFSVSPKAESAVIVANPREEIIVKNKDEEEKL
VSNNILHNQQELPTALTKLVKEDEVVSSEKAKDSFNEKRVAVEAPMREEYADFKPFER
VWEVKDSKEDSDMLAAGGKIESNLESKVDKKCFADSLEQTNHEKDSESSNDDTSFPST
PEGIKDRSGAYITCAPFNPAATESIATNIFPLLGDPTSENKTDEKKIEEKKAQIVTEK
NTSTKTSNPFLVAAQDSETDYVTTDNLTKVTEEVVANMPEGLTPDLVQEACESELNEV
TGTKIAYETKMDLVQTSEVMQESLYPAAQLCPSFEESEATPSPVLPDIVMEAPLNSAV
PSAGASVIQPSSSPLEASSVNYESIKHEPENPPPYEEAMSVSLKKVSGIKEEIKEPEN
INAALQETEAPYISIACDLIKETKLSAEPAPDFSDYSEMAKVEQPVPDHSELVEDSSP
DSEPVDLFSDDSIPDVPQKQDETVMLVKESLTETSFESMIEYENKEKLSALPPEGGKP
YLESFKLSLDNTKDTLLPDEVSTLSKKEKIPLQMEELSTAVYSNDDLFISKEAQIRET
ETFSDSSPIEIIDEFPTLISSKTDSFSKLAREYTDLEVSHKSEIANAPDGAGSLPCTE
LPHDLSLKNIQPKVEEKISFSDDFSKNGSATSKVLLLPPDVSALATQAEIESIVKPKV
LVKEAEKKLPSDTEKEDRSPSAIFSAELSKTSVVDLLYWRDIKKTGVVFGASLFLLLS
LTVFSIVSVTAYIALALLSVTISFRIYKGVIQAIQKSDEGHPFRAYLESEVAISEELV
QKYSNSALGHVNCTIKELRRLFLVDDLVDSLKFAVLMWVFTYVGALFNGLTLLILALI
SLFSVPVIYERHQAQIDHYLGLANKNVKDAMAKIQAKIPGLKRKAE"

polyA_signal 4605..4610

polyA_site 4622 
ORIGIN 

Query Match 62.6%; Score 2343.6; DB 9; Length 4632; 

Best Local Similarity 81.3%; Pred. No. 0; 

Matches 3017; Conservative 0; Mismatches 574; Indels 119; Gaps 21; 
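
This hit reports the same score and match statistics as AB020693, and the two records differ mainly in where the shared sequence sits. A small sketch of that relationship, using only coordinates printed in the two records (CDS locations and the first Db row of each alignment); the dictionary layout and the helper name `shift_between` are illustrative, not part of the report.

```python
# Coordinates as printed in the two GenBank records and their alignments.
ab020693 = {"cds": (135, 3713), "first_db_row": (16, 75)}
af148537 = {"cds": (142, 3720), "first_db_row": (23, 82)}


def shift_between(a: tuple, b: tuple) -> tuple:
    """Return the (start, end) coordinate differences between two spans."""
    return (b[0] - a[0], b[1] - a[1])


cds_shift = shift_between(ab020693["cds"], af148537["cds"])
row_shift = shift_between(ab020693["first_db_row"], af148537["first_db_row"])
print(cds_shift, row_shift)  # (7, 7) (7, 7)

# Both CDS spans cover the same number of bases (inclusive coordinates).
len_ab = ab020693["cds"][1] - ab020693["cds"][0] + 1
len_af = af148537["cds"][1] - af148537["cds"][0] + 1
print(len_ab, len_af)  # 3579 3579
```

A constant shift of 7 with equal CDS lengths is consistent with AF148537 carrying seven additional bases of 5' sequence ahead of the aligned region.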

Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193 

I M I I I I ! I I I M I I I I I I M I I I I I I I I I I I I I | | | | | | | || | M II 
Db 23 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 82 

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252 

I I I I I I M I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 83 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 141 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I I M I I I I I I I I I I I I I I I I II I I I I I II I I I II I I M I I I I I I I I I I I 
Db 142 ATGGAAGACCTGGACCAGTCTCCTCTGGT---CTCGTCCTCGGACAGCCCACCCCGGCCG 198

Qy 313 CCGCCCGCCTT CAAGT AC CAGT T C GT GAC GGAGC C C GAGGAC GAGGAGGAC GAGGAG GAG 372 

I I M I I I II I I I I I II I I II I II I I I I I I I II I I I I I I I I I I I I I || || Ml 
Db 199 CAGC C C GC GTT CAAGT AC CAGT T C GT GAGG GAG C C C GAGGAC GAGGAG GAAGAAGAG 255 

Qy 373 GAGGAGGAC GAGGAGGAGGACGACGAGGACCT AGAGGAACTGGAGGTGCT GGAGAGGAAG 432 

M I I M I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 256 GAGGAGGAAGAGGAGGAC GAGGAC GAAGAC CT GGAGGAGCT GGAGGT GCT GGAGAGGAAG 315 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 4 86 

I I I I I I M I I I I I I I M I I I I I I I I I I I I I I I I I I I I II I I I I I I 

Db 316 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 375 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 376 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 435 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 436 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 495 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I M I I I I I M I I I I I I I I I I I I I I I II I I I I I M I I I I I I I I I I I I I I I I I 
Db 4 96 GCGCCATCCCCGCTGTCT GCT GCCGCAGTCTCGCCCTCCAAGCTCCCT GAGGAC GACGAG 555 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I I I I I I I I I II II II I I I II II III III I I I I I I I I III III 
Db 556 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 615 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 616 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 675 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I I I I I I I I I I I II I I I I I I I I I I I I II I I I I M I I | | | | | | | | | | | | | | | I I I I I I | 



Db 676 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 735 

Qy 8 08 GT GAT AC CCTCCTCT GC AGAAAAAAT TAT G GAT TT GAT G GAGC AGC C AGGT AAC AC T GTT 8 67 

I I I I I I I I I I I I I II I I I I I I I I I I M I I I I I I I I I I | I I I I I I I I I I I I I II 
Db 736 GT GAT AC GCT C CT C T G C AGAAAA TAT GGACTT GAAGGAGCAGC CAGGTAAC ACT ATT 7 92 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

Ml I I I I I I I I I M I I M I M I I I I II I I I I I I I I I I I I I I I I I I I II I I I I II I M I 
Db 793 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 852 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

M I I I I I II I I I I I I I MINI I II I I II I I I I I I I I I I I I I I I II III II 
Db 853 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 912

Qy 988 GTGTCATCCTCAGAAGGAACAATTGAAGAAACTTTAAATGAAGCTTCTAAAGAGTTGCCA 1047 

M I I M I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I | || 
Db 913 GTATTACCCACTGAAGGAACACTTCAAGAAAATGTCAGTGAAGCTTCTAAAGAGGTCTCA 972 

Qy 1048 GAGAGGGCAACAAAT C CAT T T GT AAAT AGAGAT T TAG C AGAAT T T T C AGAAT T AGAAT AT 1107 

INI I I I I I II II I I II II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 973 GAGAAG G CAAAAACT CT AC T CAT AGAT AGAGAT TTAACAGAGT TT T C AGAAT T AGAAT AC 1032 

Qy 1108 T C AGAAAT GGGAT CAT CT T T TAAAGG CT C C C CAAAAGGAGAGT CAGC CAT AT T AGT AGAA 1167 

I I I I I M I I I I I I I I I I I I I I III I I I II I I III II III II I I I M I I 
Db 1033 T C AGAAAT G GGAT C AT CGT T C AGT GT C T C T C CAAAAGC AGAAT CT GC C GTAAT AGT AG C A 1092 

Qy 1168 AACACTAAGGAAGAAGTAATTGTGAGGAGTAAA GAC AAAGAGGAT T T AGT T T GT AGT 1224 

M I I I I I I I I I I M I I I I I I I I I i I II I I I I I I I I I I I I II I I 
Db 1093 AAT C CT AG G GAAGAAAT AAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGT AAT 1152 

Qy 1225 G CAGC C C T T C AC AGT C C ACAAGAAT C AC C T GTGGGTAAAGAAGAC 1269 

I I I I I I I I I I I I I I I I I I I I Ml MINI II 

Db 1153 AACAT C C T T CAT AAT CAACAAGAGT T AC CT AC AGCT CT T ACT AAATT G GT TAAAGAGGAT 1212 

Qy 1270 AGAGT T GT GT C T C C AGAAAAGACAAT GGAC AT T T T T AAT GAAAT GC AGAT GT C AG TAG T A 1329 

I I I I I I I I I I I I I I I I I III I I I I I I I I M I I I I I I I I I i I I 

Db 1213 GAAGT TGTGTCTT C AGAAAAAGC AAAAGAC AGT T T TAAT GAAAAGAGAGT T G C AGT GGAA 1272 

Qy 1330 GC AC C T GT GAGG GAAGAGT AT G C AGACT T TAAG C CAT T T GAACAAGC AT GGGAAGT GAAA 138 9 

M I M II I I I I I I I I I I I II I I I II I I I I || I I I I I || M I I I I 

Db 1273 GCT C CT AT GAG G GAGGAAT AT GC AGAC T T CAAAC CAT T T GAGC GAGT AT GG GAAGT GAAA 1332 

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I || 

Db 1333 GATA GTAAGGAAGATAGTGATATGTTGGCTGCTGGAGGTAAAATCGAGAGCAACTTG 138 9 

Qy 1438 G AAAGT AAAG T G G AC AGAAAAT G C T T G GAAG AT AG C C T G GAG C AAAAAAGT C T T G G GAAG 1497 

M I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I III I II 
Db 1390 GAAAGTAAAGTGGATAAAAAATGTTTTGCAGATAGCCTTGAGCAAACTAATCACGAAAAA 1449 

Qy 1498 GAT AGT GAAGG CAGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C C AGAAC CT GT GAAGGAC 1557 

I I I I I I I I I II I I I I I Ml I I I II I I I I I I II I II I I I I I I I I I I I II 
Db 1450 GAT AGT GAGAGT AGTAAT GAT GAT ACT T C T T T C C C CAGT AC GC CAGAAG GT AT AAAGGAT 150 9 

Qy 1558 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTA C C T CAGCAAC C GAAAG C AC C AC A 1614 

Ml I M I I I I I I I II I I I I I I I I I I M I I I I I I I I I I I I I I II 
Db 1510 C GT T C AGGAGC AT AT AT CAC AT GTGCTCCCTT TAAC C C AG C AG CAAC T GAGAGC AT T GC A 1569 



Qy 1615 GCAAAC ACT TT C C CT T T GT T AGAAGAT CAT ACT T C AGAAAAT AAAAC AGAT GAAAAAAAA 1674 

I I I I I I M I I I I I I I I I I I I I I I I I I I I I I II II II I I I || I I I I II | | | | | | 
Db 1570 AC AAAC AT TTTTCCTTT GT T AG GAGAT C CT AC T T C AGAAAAT AAGAC C GAT GAAAAAAAA 1629

Qy 1675 AT AGAAGAAAG G AAG G C C C AAAT T AT AAC AGAGAAG AC TAG C C C C AAAAC GT C AAAT 1731 

I I I I I I I I I I M I I I I I I I I I I I I II I I I I I I I MINI I 1 I I I I I Mill 
Db 1630 AT AGAAGAAAAGAAG G C C C AAAT AGT AAC AGAGAAGAAT AC TAG C AC C AAAAC AT C AAAC 1689 

Qy 1732 CCTTTCCTTGTAGCAGTACAGGATTCTGAGGCAGATTATGTTACAACAGATACCTTATCA 1791 

I M I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I M I I I I I I I I I I I III II 
Db 1690 CCTTTTCTT GT AGC AGC AC AG GAT T CT GAGAC AGAT TAT GT C AC AAC AGAT AAT T T AAC A 1749 

Qy 1792 AAGGTGACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATTTAGTTCAG 1851

I I I I I I I I I I I I I M III I I I I II I I I M I I I I I I II I I I I I I I I I I I I I III 
Db 1750 AAGGTGACTGAGGAAGTCGTGGCAAACATGCCTGAAGGCCTGACTCCAGATTTAGTACAG 1809

Qy 1852 GAAG CAT GT GAAAGT GAAC T G AAT GAAGC C AC AG GT AC AAAGAT T GC T TAT G AAAC AAAA 1911 

I I I I M I I I I I I I I I I I I I I II I II I I II I I I || I I I | | | || | | | | | | || M M || 
Db 1810 GAAGCATGTGAAAGTGAATTGAATGAAGTTACTGGTACAAAGATTGCTTATGAAACAAAA 1869 

Qy 1912 GTGGACTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAGCACAGCTT 1971 

I I I I I I I I II M I I I I I I I I I I I I M I I I I I I I I I I II I I II I II I II I I I 
Db 1870 ATGGACTTGGTTCAAACATCAGAAGTTATGCAAGAGTCACTCTATCCTGCAGCACAGCTT 1929 

Qy 1972 TGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATATTGTTATG 2031 

I I M I I I I I I I I I I || I I I I I I I I M I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 1930 TGCCCATCATTTGAAGAGTCAGAAGCTACTCCTTCACCAGTTTTGCCTGACATTGTTATG 1989

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

I I I I I I I I I I I I I I I I I I M I I I I I I I | | M I II I I I I I I I I I || | 

Db 1990 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 2049

Qy 2092 TCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCTGAAAAC 2151

M IN I II I I II II I I I I I I I I I I M || I | | | | | I | | | M I I I II I I 
Db 2050 T C AC CAT TAG AAG C T T C T T C AG T T AAT TAT GAAAG CAT AAAAC AT GAG C C T GAAAAC 2106 

Qy 2152 C C C CCACCATAT GAAGAAGCCAT GAAT GT AGCACT AAAAGCTTTGGGAACAAAGGAA 2208 

I I I I I M I I I I I I I I M I I I I I I I I I II I I I I Ml I I I I I I I I I 

Db 2107 C C C C C AC CAT AT GAAGAGG C CAT G AGT GT AT C ACT AAAAAAAGT AT C AG GAAT AAAGGAA 2166 

Qy 2209 GGAATAAAAGAGCCTGAAAGTTTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTTATATA 2268

I Ml I I I I I I I I I II I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2167 GAAAT T AAAGAG C CT GAAAAT AT T AAT GC AG C T CT T C AAGAAAC AGAAG CT C C T TAT AT A 2226 

Qy 2269 TCCATTGCGTGTGATTTAATTAAAGAAACAAAGCTCTCCACTGAGCCAAGTCCAGATTTC 2328

M I I I I I I I I I II I I I I I I I I I I || | | | | | | | | || | M I III III I I I I I I 
Db 2227 TCTATTGCATGTGATTTAATTAAAGAAACAAAGCTTTCTGCTGAACCAGCTCCGGATTTC 2286

Qy 2329 TCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGCTAGTG 2388

I I I I I I I I M I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2287 TCTGATTATTCAGAAATGGCAAAAGTTGAACAGCCAGTGCCTGATCATTCTGAGCTAGTT 2346 

Qy 2389 GAGGATTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTCCTGAA 2448

M I I I I M I I I I I I II || | | | | | | | | | | || | | | | | | | | | | | || | | | | | | | | | I II 
Db 2347 GAAGATTCCTCACCTGATTCTGAACCAGTTGACTTATTTAGTGATGATTCAATACCTGAC 2406



Qy 2449 GT CCCACAAACACAAGAGGAGGCT GT GAT GCTCAT GAAGGAGAGT CT CACT GA A 2502

I I I M I I I I I ! I I I I I I I I I I I I I I! I II I I II I I I I I I I I I I I I 
Db 2407 GTTCCACAAAAACAAGAT GAAACT GT GAT GCTT GT GAAAGAAAGT CT CACT GAGACTTCA 2466 

Qy 2503 GT GT CT GAGACAGTAGC CCAGCACAAAGAGGAGAGACTT AGT GCCT CAC CT CAGGAGCTA 2562 

I I Ml I III I I I I I I || I | | I | | Ml M | | 

Db 2467 T T T GAGT CAAT GATAGAAT AT GAAAAT AAGGAAAAACT C AGT G C T T T G C CAC CT GAGGGA 2526

Qy 2563 GGAAAGCCAT AT T T AGAGT CT T T T C AGC C CAAT TT AC AT AGT AC AAAAGAT GC TGCA 2619 

I M I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I M I I I II I I I 
Db 2527 GGAAAG C CAT AT T T GGAAT CTT TT AAGC T C AGT TT AGATAAC ACAAAAGAT AC C C T GTT A 2586 

Qy 2620 T CTAAT GACATT C CAACATT GACCAAAAAGGAGAAAATTT CTT TGCAAAT GGAAGAGTTT 2679

II I I I I M M I M I I II I I I I I I I I I I II I I I I I I I I I I I I I I II III I 

Db 2587 CCT GAT GAAGTTTCAACATT GAGCAAAAAGGAGAAAATT C CTTT GCAGAT GGAGGAGCT C 2646

Qy 2680 AAT AC T G CAAT T TAT T CAAAT GAT GAC T T ACT T T CTT CTAAG GAAGACAAAATAAAAGAA 2739

I I I I I I I I I I II I I I I I I I I II I I I I I I II I I I II I I I I II I | | II I I I I 
Db 2647 AGTACT GCAGTTTATT CAAAT GAT GACTT AT TTATTT CTAAGGAAGCACAGATAAGAGAA 2706 

Qy 2740 AGT GAAACAT TT T CAGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT T T C C C AC GT T T GT C 2799 

I I I I I I I I I I I I II I I I II I I I I I I I II I II I I I I I I I I II I I II || M 
Db 2707 ACT GAAAC GT TT T CAGAT T CAT CT C CAAT T GAAAT TAT AGAT GAGTT C C CT AC AT T GAT C 2766 

Qy 2800 AGT GCT AAAGAT GATT C T C CT AAAT TAG C CAAGGAGTACACT GAT CT AGAAGTAT C C 2856

Ml Mill II II I I I I I I I II I M I II III II I I I II II II I I II I I I I 
Db 2767 AGT T CTAAAACT GATT CAT T T T CT AAAT T AGC C AGGGAAT AT ACT GAC C T AGAAGTAT C C 2826 

Qy 2857 GAC AAAAGT GAAAT T GCT AAT AT C CAAAGC G GG GCAGAT T CAT T G C CT T GCT T AGAAT T G 2916 

M II II I II II I I II I II II II II I I I I I M I II I II I I II I I I I I 

Db 2827 CACAAAAGT GAAAT T GCTAAT G C C C C G GAT G GAG CT GG GT CAT T G C CT T G CAC AGAAT T G 2886

Qy 2917 C C C T GT GAC CTTTCTTT CAAGAAT AT AT AT C CTAAAGAT GAAG T AC AT GTT T C A 2970

Ml I I II I II I II I I I I II I I I I I I I I I I I I I I M I I I I I I 

Db 2887 C C C CAT G AC CT T T CTT T GAAGAAC AT ACAAC C CAAAGT T GAAGAGAAAAT CAGT T T CT C A 2946

Qy 2971 GAT GAAT T C T C C GAAAAT AGGT C CAGT GT AT CTAAG GCAT C CAT ATC G C CT T CAAAT GT C 3030 

M I I I I I I I M I I I M II I II I I I I I I I I I || | | | | | | M 

Db 2947 GAT GACTTTTCTAAAAATGGGTCT GCT ACATCAAAGGT GCT CTTATT GCCT CCAGAT GTT 3006

Qy 3031 TCTGCTTT GGAAC C T C AGAC AGAAAT GG GC AGC AT AGT T AAAT C CAAAT C ACTT AC GAAA 3090 

I I I I I I I I I I MM M II I I I M I I II I I I I || I || || Ml | | | | 
Db 3007 TCTGCTTTGGC CAC T CAAGC AGAGAT AGAGAGC AT AGT TAAAC C CAAAGT T CT T GT GAAA 3066 

Qy 3091 GAAGC AGAGAAAAAACT T C CTT CT GAC AC AGAGAAAGAGGAC AGAT C C CT GT CAGCT GT A 3150 

I I I I I II I I I II I I I I II II II II Mill I I I I I II II I I M I I II III II 
Db 3067 GAAGCT GAGAAAAAACTT CCTT CC GATAC AGAAAAAGAGGACAGATCAC CAT CT GCTATA 3126 

Qy 3151 T T GT C AG C AGAGC T GAGTAAAACT T CAGT T GTT GAC C T C C T CT AC T GG AGAGAC AT T AAG 3210 

M I II I I I II II I I I I II M II II II I I II I II I I I I I M I I II I II II I M II I II I 
Db 3127 T T T T C AG C AGAGCT GAGTAAAACT T C AGT T GT T GAC CT CCT GT ACT GGAGAGAC ATTAAG 3186

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270 

M M M I I II II I II I I I I II M I II I II II I II I II II I II II II II I II I M I 
Db 3187 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3246

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 







1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 II 1 1 1 II 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 II 1 1 II 1 1 1 1 1
Db 3247 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3306

Qy 3331 AT AT AT AAGGG C GT GAT C CAGG CT AT C CAGAAAT CAGAT GAAGG C CAC C CAT T CAGGGCA 3390

M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 M 1 1 1 1 1 1 1 1 1 1 1 I M I
Db 3307 ATATACAAGGGTGTGATCCAAGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCA 3366

Qy 3391 TATTTAGAAT CT GAAGTT GCT AT AT CAGAGGAATT GGTT CAGAAAT ACAGTAATT CT GCT 3450

IN 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 Mill 1 M 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1
Db 3367 TAT C T G GAAT CT GAAGT T G CT AT AT CT GAGGAGT T G GT T C AGAAGT ACAGTAAT T C T GCT 3426

Qy 3451 C T T GGT CAT GT GAAC AGCACAAT AAAAGAACT GAGGC G GCT T T T CT T AGT T GAT GAT T T A 3510

M M 1 1 1 1 1 1 1 1 M 1 1 II 1 1 1 1 1 1 1 II 1 1 Mill II II II II II 1 1 M 1 II II 1
Db 3427 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3486

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570

1 1 II II 1 1 II 1 1 1 1 1 Ml 1 II M 1 II 1 II 1 II 1 1 1 II 1 1 1 II M 1 II II II 1 II 1 1
Db 3487 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3546

Qy 3571 AAT GGT CTGACAC TACT GATTTTAGCTCT GAT CTCACTCTTCAGT ATT CCTGTTATTTAT 3630

II II II M 1 M 1 1 1 II 1 1 1 M 1 1 1 II 1 1 II 1 II II 1 II 1 M 1 II II M II 1 II 1 II
Db 3547 AAT GGT CTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTT CCTGTTATTTAT 3606

Qy 3631 GAAC GG CAT CAGGT GC AGAT AGAT CAT T AT CT AGGACTT GCAAACAAGAGT GT T AAGGAT 3690

1 II 1 1 1 M M M 1 1 II II II II 1 II 1 1 M II II 1 1 1 1 II II 1 MM II M 1 1 III
Db 3607 GAAC GGCATCAGGCACAGAT AGAT CAT TAT CT AGGACTT GCAAATAAGAAT GTTAAAGAT 3666

Qy 3691 G C C AT GGC CAAAAT C CAAG CAAAAAT C C C T GGAT T GAAG CG CAAAGC AGA 3740

M 1 II II II 1 1 1 II 1 1 II 1 1 II 1 II 1 II 1 M II 1 II II 1 1 II 1 1 1 II
Db 3667 GCT AT GGCTAAAAT C CAAGCAAAAAT CC CT GGATT GAAGCGCAAAGCT GA 3716
Query Match 62.4%; Score 2333.2; DB 6; Length 4093;

Best Local Similarity 81.3%; Pred. No. 0; 

Matches 3017; Conservative 0; Mismatches 573; Indels 120; Gaps 22; 
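
The summary percentages above appear to follow directly from the reported counts. The sketch below is an assumption inferred from the numbers in this report, not GenCore's documented method: it takes Query Match as the score divided by the perfect score (3741 in the report header) and Best Local Similarity as matches divided by the number of aligned columns (matches + mismatches + indels). The function names are hypothetical.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def query_match(score, perfect_score):
    # Assumed definition: alignment score relative to the perfect score.
    return percent(score, perfect_score)


def best_local_similarity(matches, mismatches, indels):
    # Assumed definition: identical columns over all aligned columns.
    return percent(matches, matches + mismatches + indels)


if __name__ == "__main__":
    # Values taken from the summary lines of this result.
    print(query_match(2333.2, 3741))
    print(best_local_similarity(3017, 573, 120))
```

Under these assumptions the two calls reproduce the 62.4% and 81.3% reported for this hit.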

Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193
I M II I I I I I I I I I I I I I I I I I I I I M I I I I I ! I I I I I I I I I I I I | | |
Db 33 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 92

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252
I I I I I I M I I I I I I I I I I I I I I I I I I I I M I I I I II II II I I I I
Db 93 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 151

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312
I I I M I I I I I I I I I I I I I I I I I I I I I I I I I | | | || | M | | | I | | | | | | |
Db 152 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 208

Qy 313 CCGCCCGCCTT CAAGT AC C AGT T C GT GAC GGAG C C C GAGGAC GAGGAGGAC GAGGAGGAG 372
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | || M III
Db 209 CAGCCCGCGTTCAAGTACCAGTTCGTGAGGGAGCCCGAGGACGAGGAG GAAGAAGAG 265

Qy 373 GAGGAGGAC GAG GAGGAGGAC GAC GAGGACCTAGAGGAACT GGAGGT GCT GGAGAGGAAG 432

I M I I I II MINIM M I I I I I Mill I I I II II I II I I II II I I I II I II II

Db 266 GAG GAGGAAGAGGAG GAC GAG GAC GAAGAC C T GGAGGAGCT GGAGGT GCT GGAGAGGAAG 325

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486

M I M M II I M II I II I I I I I I II I I I I I || I II I II I I II I I I

Db 326 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 385

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546

M I I II I I I I MM II I II I I I I || I I II I I I I I I I II I || I I I I I I
Db 386 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 445

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597

M I I M I I I II II I I I I I II I I I I I I II II I I I I I I I I

Db 446 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 505

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657
I I I I I I I I I I II I I | || I I I II I I II II II II II I II II II I I II I II II I I I
Db 506 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 565



Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I II I I I I I I M I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I 
Db 566 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 625 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 626 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 685

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I I I I I I M I I I I I I I I I M M I I I I I I I I I I I I I I I I I I M II I I I I I I M I I I I I I 
Db 686 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 745

Qy 808 GT GAT AC CCTCCTCTG C AGAAAAAATT AT GGAT T T GAT GGAG CAG C C AGGTAAC ACT GT T 867 

I I I I I I I I I I I I I I I I I I I M I MINI I I I I I I I I I I I M M || M | | | | l| 
Db 746 GT GAT AC GC T C CT CT G C AGAAAA TATGGACTTGAAGGAGCAGCCAGGTAACACTATT 802 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927

M I I I I I I I I I I II I I I I I I I I I I II I | | | | || | | | | | | | | | | | | | I I I | | | | | M I I 
Db 803 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 862 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987

M I I I I M I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I II IN | | 
Db 863 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 922

Qy 988 GT GT CAT C CT CAGAAGGAACAAT T GAAGAAAC T T TAAAT GAAGCT T CT AAAGAGTT G C C A 1047 

M I I II I I II I I I I I I II I I I I I I I I I | | | | | | | | | M | | | M | || 
Db 923 GT AT T AC C C AC T GAAG GAAC ACT T C AAGAAAAT GT C AGT GAAG CT T CT AAAGAGGT CT C A 982 

Qy 1048 GAGAGGG CAACAAAT C CAT T T GT AAAT AGAGAT T TAGC AGAAT T T T C AGAAT T AGAAT AT 1107

I I I I I I I I I II II I I || || | | || | | | || MM II II II M I I I I I II I I 
Db 983 GAGAAG GCAA- AACT CT ACT CAT AGAT AGAGAT T TAAC AGAGT T T T CAGAATT AGAAT AC 1041

Qy 1108 T C AGAAAT GGGAT CAT CT T T T AAAGG C T C C C CAAAAGGAGAGT C AGC CAT AT T AGT AGAA 1167 

I I I I I I I I I I M II I I I I I I I I M II III II III II M M I I I 

Db 1042 T C AGAAAT G G GAT CAT C GT T C AGT GT C T C T C CAAAAG C AGAAT CT GC CGT AAT AGT AGC A 1101 

Qy 1168 AACACTAAGGAAGAAGTAATTGTGAGGAGTAAA GACAAAGAGGATT T AGT T T GT AGT 1224 

I I I I II II I I III I M I I II I II 

Db 1102 AAT C CT AG G GAAGAAAT AAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGT AAT 1161 

Qy 1225 G CAG C C CTT C AC AGT C C ACAAGAAT C AC CT GT GGGTAAAGAAGAC 1269 

I I I I I I I I I I I II II I I I II I M I I I I I I I I 

Db 1162 AACAT C C T T CAT AAT CAACAAGAGT T AC CT AC AG CT C T TACT AAAT T GGT T AAAGAGGAT 1221 

Qy 1270 AGAGTT GT GT CT CCAGAAAAGACAAT GGACATTTTTAAT GAAAT GCAGAT GT CAGTAGTA 1329

I I M I II II I I I I II I I Ml I II I I I I II I I II I I I I M M I I 

Db 1222 GAAGTT GT GT CTT CAGAAAAAGCAAAAGACAGTTTTAAT GAAAAGAGAGTT GCAGT GGAA 1281

Qy 1330 G C AC CT GT GAG GGAAGAGT AT GCAGAC T T T AAGC CAT T T GAACAAG CAT GGGAAGT GAAA 1389

M I I I I I I I I I I M I I I II II II I || || | || I I M I I II II II I 

Db 1282 GCT C CT AT GAGG GAGGAAT AT G C AGACT T CAAAC CAT T T GAGC GAGT AT GGGAAGT GAAA 1341 

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437 

M I I I I I I I I I I I I I I I I I II II II I II I || II 

Db 1342 GATA GT AAG GAAGAT AGT GAT AT GT T G GCT GCT GGAG GT AAAAT C GAGAGCAAC T T G 1398 

Qy 1438 GAAAGT AAAGT GGAC AGAAAAT GC T T G GAAGAT AGC C T G GAG C AAAAAAGT CTT G GGAAG 1497 



Db 1399 GAAAGT AAAGT GGATAAAAAAT GT T TT G CAGATAG C C T T GAG C AAAC TAAT C AC GAAAAA 1458 

Qy 1498 GAT AGT GAAG GC AGAAAT GAG GAT GCTTCTTTCCC C AGT AC C C C AGAAC C T GT GAAG GAC 1557 

I I I I I I I I I II I I I I I III I I I I I I I I I I I I I I I I I I I I II I I I I I I I 
Db 1459 GAT AGT GAGAGT AGTAAT GAT GATACT T CTTT C CCCAGTACGC CAGAAGGTATAAAGGAT 1518 

Qy 1558 AGC T C C AGAG CAT AT AT T AC CT GTGCTTCCTT T A C C T C AGCAAC C GAAAGC AC C ACA 1614 

III I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II II I I I I I I 
Db 1519 C GT T C AGGAGCAT AT AT C AC AT GTGCTCCCTT TAAC C CAGC AGC AAC T GAGAG CAT T GC A 1578

Qy 1615 G C AAAC AC TTTCCCTTT GT T AGAAGAT CAT ACT T C AGAAAAT AAAAC AGAT GAAAAAAAA 1674 

II I I I I III I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 1579 ACAAAC AT TTTTCCTTT GT T AGGAGAT C CTACTT C AGAAAAT AAGAC C GAT GAAAAAAAA 1638 

Qy 1675 ATAGAAGAAAGGAAGGCCCAAATTATAACAGAGAAG ACTAGCCCCAAAACGTCAAAT 1731 

M I I I I I I I I I I I I It I I I I I I I I II I I I I I I I I I I I I I I I I I I I I Mill 
Db 1639 AT AGAAGAAAAGAAGG C C C AAAT AGT AACAGAGAAGAAT AC T AGC AC CAAAAC AT CAAAC 1698 

Qy 1732 CCTTTCCTT GTAG CAGT AC AG GAT T CT GAG GC AGAT TAT GT T AC AAC AGAT AC C T TAT C A 1791

I I I I I II I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I III II 
Db 1699 CCTTTTCTT GTAG C AG C AC AG GAT T C T GAGAC AGAT TAT GT C AC AACAGAT AAT T TAAC A 1758 

Qy 1792 AAG GT GAC T GAGGC AG CAGT GT CAAAC AT GC CT GAAGGT CT GAC GC C AGATT T AGT T C AG 1851 

I I I I I I I I M I I I II III I I I I I I I II II I I I I I I I I I I I I I I I I I I II I Ml 
Db 1759 AAGGT GACT GAGGAAGT C GT GGCAAAC AT GC C T GAAGG C C T GACT C C AGAT T T AGT AC AG 1818 

Qy 1852 GAAGC AT GT GAAAGT GAACT GAAT GAAGC C AC AGGT ACAAAGAT T GCT T AT GAAACAAAA 1911 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I M I I I I I I I I I I I I I I I I I I I I I I 
Db 1819 GAAG CAT GT GAAAGT GAAT T GAAT GAAGT T ACT GGTACAAAGAT T GCT TAT GAAACAAAA 1878

Qy 1912 GT G GACT T G GT C CAAAC AT C AGAAG CT AT ACAAGAAT C AC T T T AC C CC ACAGC ACAGCT T 1971 

I I I II I I I I I I M I I I I I I I I I I III Mill I I II I II II I I II I I I I II I 
Db 1879 AT G GACT T G GT T CAAAC AT C AGAAGT T AT G CAAGAGT C ACT CT AT C C T GC AG C AC AGCT T 1938 

Qy 1972 T GC C CAT C ATTT GAGGAAG C T GAAG CAAC T C C GT C AC CAGTT T T GC C T GAT AT T GT TAT G 2031 

I I I I I I I I I I I I II II I I I I I I II I I I I I I I I II II I I I I I II I I I II I I II I 

Db 1939 T GC C CAT CAT T T GAAG AGT C AGAAGC T ACT C CT T C AC CAGT TT T GC CT GAC AT T GT TAT G 1998 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

II I I I I I II II I I I I II M II I I I M I I I I I II I II I I I I II I II I 

Db 1999 GAAGCACCATTGAATTCTGCAGTTCCTAGT GCT GGTGCTTCCGT GAT ACAGC CCAGCTCA 2058 

Qy 2092 T CC C C ACT G GAAGC AC CT C CTCCAGTT AGTT AT GACAGTATAAAGCT T GAGC CTGAAAAC 2151 

II Ml I I I I I II II I I II II I I I II I II I II I I I II I I I I II I I II I 
Db 2059 T C AC CAT TAG AAG CT T CTT CAGT T AAT TAT GAAAGCAT AAAAC AT GAG C C T GAAAAC 2115 

Qy 2152 C C C C CAC CAT AT GAAGAAG C CAT GAAT GTAG CACT AAAAGCTTTGGGAACAAAGGAA 2208 

II I I II I I M I I I I II I II I I I I I I I I I I I I I II I II I II I I I I I II I I 
Db 2116 C CC C CACCATAT GAAGAGGC CAT GAGT GTAT CACTAAAAAAAGTAT C AGGAATAAAGGAA 2175 

Qy 2209 GGAATAAAAGAGC C T GAAAGT T T TAAT G C AGC T GTT C AG GAAAC AGAAGC T C C T TAT AT A 2268 

I III I II II I II II I I I I I I I I II I I I II I I I I II I II II II I II II I I I I I II 

Db 2176 GAAAT T AAAGAG C C T GAAAAT AT TAAT G C AGC T CTT CAAGAAAC AGAAGC T C CT TAT AT A 2235 

Qy 2269 T CC AT T G C GT GT GAT TTAAT T AAAGAAAC AAAGCT CT C CACT GAG C C AAGT C C AGAT T T C 2328 

II II II I II I II I I II I I I I II I I II II I I I II II MM III III II I M I 



Db 2236 T C T ATT GC AT GT GAT T TAAT TAAAGAAACAAAGC T T T CT G C T GAAC C AGCT C C GGATT T C 2295 

Qy 2329 T C TAAT TAT T C AGAAAT AGCAAAAT T C GAGAAGT C G GT GC C C GAAC AC G CT GAGCT AGT G 2388

I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I I I II I I I I I I | I I I I | I | 
Db 2296 T CT GAT TAT T C AGAAAT GGCAAAAGTT GAACAG C C AGT GC CT GAT CAT T CT GAGC T AGTT 2355 

Qy 2389 GAGGAT T C C T C AC CT GAAT C T GAAC C AGT T GAC TT AT TT AGT GAT GAT T C GAT T C CT GAA 2448

M M I I I I I M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I Mill 
Db 2356 GAAGAT T C CT CAC C T GAT T C T GAAC CAGT T GACTT AT T T AGT GAT GAT T CAAT AC CT GAC 2415 

Qy 2449 GTCCCACAAACACAAGAGGAGGCTGTGATGCTCATGAAGGAGAGTCTCACTGA A 2502

M II I I I I I I I I I I I I I I I I || I I M I I II I I I I I I I I I I I I | | | 
Db 2416 GTTCCACAAAAACAAGATGAAACTGTGATGCTTGTGAAAGAAAGTCTCACTGAGACTTCA 2475 

Qy 2503 GT GT CT GAGAC AGT AG C C CAGC ACAAAGAGGAGAGAC TT AGT GC CT CAC C T C AGGAGC T A 2562 

I I Ml I Ml I I I I I I I I I I I I I I Ml Ml | 

Db 2476 T T T GAGT CAAT GAT AGAATAT GAAAAT AAGGAAAAACT C AGT GCT T T G C CAC CT GAGGGA 2535

Qy 2563 G G AAAG C CAT AT T TAG AGT C T T T T C AG C C CAAT T T AC AT AGT AC AAAAG AT G C TGCA 2619 

I I I M I M I I I I I I I I I II I I I I I I I I M I I I I I I I I I I I I I I I | 
Db 2536 GGAAAG C CATAT TT GGAAT C T TT TAAGC T CAGT T T AGAT AAC ACAAAAGAT AC C CT GT T A 2595 

Qy 2620 T CTAAT GAC AT T CCAACAT T GAC CAAAAAGGAGAAAAT T T C TT T GCAAAT GGAAGAGT T T 2679

M I I I I I I M I I M II I I I II I I I I I I I II I I I I I I I II I I I I I I M | | 
Db 2596 C CT GAT GAAGT T T CAACATT GAGCAAAAAG GAGAAAAT T C CTT TGCAGAT GGAGGAGC T C 2655

Qy 2680 AAT ACT GCAAT T T ATT CAAAT GAT GAC TT ACT T T CT T C TAAGGAAGACAAAAT AAAAGAA 2739 

I I M M I I I I I I I II II I I I I I M II I I I I I I I II I I I I II I MM I I I I 
Db 2656 AGT AC T G CAGT T T ATT CAAAT GAT GAC T TAT T T ATT T CT AAGGAAGC AC AGAT AAGAGAA 2715 

Qy 2740 AGT GAAAC AT T T T C AGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT T T GT C 2799

I I M I M M M I I I I I I I I I I I I I II I II II II I I II I I II II II II II 
Db 2716 ACT GAAAC GT T T T C AGAT T CAT C T C CAAT T GAAAT TAT AGAT GAGTT C C CT AC AT T GAT C 2775 

Qy 2800 AGT GCT AAAGAT GATT C T C CTAAAT TAG C CAAG GAGT AC ACT GAT CT AGAAGT AT C C 2856 

M I I I I I I II II I I I II I I I I I I I I || I M I I II I I I I II I II 

Db 2776 AGT T CTAAAACT GATT CATTTT CTAAATTAGCCAGGGAAT ATACT GACCTAGAAGTAT C C 2835

Qy 2857 GACAAAAGT GAAAT T GC TAAT AT C CAAAG C GGGGCAGAT T CAT T GC CTT GCT T AGAAT T G 2916

I I I M I I I M I II I I I I M I M II I I I III I II I I I M I II 

Db 2836 CACAAAAGT GAAATT G CTAAT GCC C C G GAT G GAG CT G GGT CAT T GC C T T GCAC AGAAT T G 2895 

Qy 2917 C C C T G T GAC CTTTCTTT C AAGAAT AT AT AT C C T AAAG AT GAAG TACATGTTTCA 2970

M I II 1 II I I I II II II I I I I II I I I I | M II I I I I Mill 

Db 2896 C C C CAT GAC CTTTCTTT GAAGAACAT ACAAC C CAAAGT T GAAGAGAAAAT CAGT T T CT C A 2955

Qy 2971 GAT GAAT T CT C C GAAAAT AGGT C C AGT GT AT CTAAGGC AT C CATAT C G C CT TCAAAT GT C 3030 

M I II II II II II I I II I I I II I III I I I I I II I I I I I II 

Db 2956 GAT GAC T T T T CT AAAAAT GG GT CT GCTAC AT CAAAGGT GCT CT T AT T G C CT CC AGAT GT T 3015 

Qy 3031 TCTGCTTT GGAAC CT CAGAC AGAAAT GGG CAG CAT AGT T AAAT C CAAAT CACT T AC GAAA 3090 

N I II II I I I I I II I I I I I I I II II I I I I II I I I II II III MM 

Db 3016 TCTGCTTTGGC CAC T CAAG C AGAGAT AGAGAGCATAGT TAAAC C CAAAGT T CT T GT GAAA 3075 

Qy 3091 GAAG CAGAGAAAAAACT T C CT T C T GAC AC AGAGAAAGAGGAC AGAT C C C T GT C AG C T GT A 3150 

I I I I I M I II II II II II II II II Mill I I II II I II I I II I I II III II 
Db 3076 GAAGC T GAGAAAAAAC TTCCTTCC GAT AC AGAAAAAGAG GACAGAT CAC CAT C T G CT AT A 3135 



Qy 3151 TT GT CAG C AG AG C T GAGT AAAAC T T C AGT T GT T GAC CT C C T C TACT G GAGAGAC AT T AAG 3210 

II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 

Db 3136 TT T T CAG C AGAGCT GAGT AAAAC T T C AGT T GTT GAC CT C CT GT ACT GGAGAGAC AT T AAG 3195 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270 

I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I II II I I I I I II I I I I I I 
Db 3196 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3255 

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 3256 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3315 

Qy 3331 AT AT AT AAG GG C GT GAT C CAGGC T AT C C AGAAAT C AGAT GAAG GC C AC C CAT T CAG G GCA 3390 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I M II I I I I I I I I I I I I I I I I I I I I I I 
Db 3316 ATATACAAGGGTGTGATCCAAGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCA 3375 

Qy 3391 T ATT T AGAAT CT GAAGT T GCT AT AT C AGAGGAAT T G GT T C AGAAAT AC AGTAAT T C T GCT 3450 

III I I I I I I I I I I I I I I I I I I I I I I I I II I I I II II I I I I II I I I I I I II I I I I I 

Db 3376 T AT CT GGAAT C T GAAGT T GCT AT AT CT GAG GAGT T GGT T C AGAAGTACAGT AAT T CT GCT 3435 

Qy 3451 CTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTA 3510 

I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I II I I II I I I I I I I I I I I I I I 
Db 3436 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3495

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I Mill I I I I I I I I I I I I I I I I I 
Db 3496 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3555 

Qy 3571 AATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTAT 3630 

I I I I I I I I I I I I II I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3556 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3615 

Qy 3631 GAAC GG C AT C AGGT GC AGAT AGAT CAT TAT C TAG GACT T GCAAACAAGAGT GT T AAGGAT 3690 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I M I I I I I I I Mill: IN 

Db 3616 GAAC GGC AT CAGGC AC AGAT AGAT CAT T AT CT AGGACT T GCAAAT AAGAAT GT TAAAGAT 3675 

Qy 3691 G C CAT GGC CAAAAT C CAAGCAAAAAT C C CT GGAT T GAAGC GCAAAGC AGA 3740 

II Mill II I II II II II I II I I II I II I I I I I I II II II I I I II II 

Db 3676 GC TAT GGCTAAAATC CAAGCAAAAAT CCCT GGAT T GAAGC GCAAAGCTGA 3725 
ALIGNMENTS



RESULT 1 
AAD01173 

ID AAD01173 standard; cDNA; 4684 BP. 
XX 

AC AAD01173; 
XX 

DT 02-NOV-2000 (first entry) 
XX 

DE Rat neurite growth inhibitor Nogo A cDNA. 
XX 

KW Rat; neurite growth inhibitor; Nogo A; neural cell; myelin; CNS; 

KW central nervous system; neoplastic disease; antiproliferative; glioma; 

KW antisense gene therapy; neuroblastoma; menagioma; retinoblastoma; 

KW degenerative nerve disease; Alzheimer's disease; Parkinson's disease; 



KW hyperproliferative disorder; benign dysproliferative disorder; diagnosis;
KW psoriasis; tissue hypertrophy; neuronal regeneration; treatment; 
KW structural plasticity; screening; ss. 
XX 

OS Rattus sp. 
XX 

FH Key Location/Qualifiers 

FT CDS 253..3744

FT /*tag= a 

FT /product= "Nogo A"

FT /transl_except= (pos:1462..1464, aa:Ile)

XX 

PN WO200031235-A2. 
XX 

PD 02-JUN-2000. 
XX 

PF 05-NOV-1999; 99WO-US026160.
XX 

PR 06-NOV-1998; 98US-0107446P.
XX 

PA (SCHW/) SCHWAB M E. 
PA (CHEN/) CHEN M S. 
XX 

PI Schwab ME, Chen MS; 
XX 

DR WPI; 2000-400052/34. 

DR P-PSDB; AAY71310. 
XX 

PT Nogo proteins and nucleic acids useful for treating neoplastic disorders 

PT of the central nervous system and inducing regeneration of neurons. 

XX 

PS Claim 26; Fig 2A; 122pp; English. 
XX 

CC The present sequence is a cDNA encoding rat Nogo A protein which is a 

CC potent neural cell growth inhibitor and is free of all central nervous 

CC system (CNS) myelin material with which it is natively associated. The 

CC present sequence was generated by fusing R018U37-3, R1-3U21 cDNA 

CC sequences isolated from hexanucleotides-primed rat brain stem/spinal cord 

CC library, and 01il8 cDNA from an oligo d(T)-primed rat oligodendrocyte

CC library. Nogo proteins and fragments displaying neurite growth inhibitory 

CC activity are used in the treatment of neoplastic disease of the CNS e.g. 

CC glioma, glioblastoma, medulloblastoma, craniopharyngioma, ependyoma, 

CC pinealoma, haemangioblastoma, acoustic neuroma, oligodendroglioma, 

CC menagioma, neuroblastoma or retinoblastoma and degenerative nerve 

CC diseases e.g. Alzheimer's and Parkinson's diseases. Therapeutics which

CC promote Nogo activity can be used to treat or prevent hyperproliferative 

CC or benign dysproliferative disorders e.g. psoriasis and tissue

CC hypertrophy. Ribozymes or antisense Nogo nucleic acids can be used to 

CC inhibit production of Nogo protein to induce regeneration of neurons or 

CC to promote structural plasticity of the CNS in disorders where neurite 

CC growth, regeneration or maintenance are deficient or desired. The animal 

CC models can be used in diagnostic and screening methods for predisposition 

CC to disorders and to screen for or test molecules which can treat or 

CC prevent disorders or diseases of the CNS. Note: SEQ ID numbers 35-42 are 

CC referred in claim 32 and SEQ ID NO: 29 in disclosure of the 

CC specification. However the specification does not include sequences for 

CC these SEQ ID numbers.



XX 

SQ Sequence 4684 BP; 1358 A; 1048 C; 1112 G; 1166 T; 0 U; 0 Others;
Query Match 100.0%; Score 3739.4; DB 3; Length 4684; 
Best Local Similarity 100.0%; Pred. No. 0; 

Matches 3740; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 ATTGCTCGTCTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCG 60

M M I I M I I I I I I I I I I I I M | | M | | I | | | | | | | | | | | | | | M | | | | | | | M | | | | | | 
Db 1 ATTGCTCGTCTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCG 60 

Qy 61 ATCGCGAAGGCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT 120 

I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I II I I I I I I I I M I I I I I I I I I I II | | || 
Db 61 ATCGCGAAGGCAGGAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT 120 

Qy 121 CGGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACA 180 

I I I I I I M I I I I I I I II I || I I I I I I I I M | | | | | | || | | | | M I I I I II I I || I I | | | | 
Db 121 CGGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACA 180

Qy 181 ACCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGC 240 

M I I M I I M I M I I I I M I I I I I I I II I I I I I I I I i I I I I I I I I I I I I I I M I I I I I I I 
Db 181 ACCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGC 240

Qy 241 GAC C C GC C AG C CAT GGAAGAC AT AGAC C AGT C GT C GCT GGTCTCCTCGTC C AC G GAC AG C 300 

I I I I M I I I I M I I II II I I I I I I I I I I || | | | | | | || | | | 1 | | | | | | || | || | | | | | || 
Db 241 GAC C C GC CAG C CAT G GAAGACAT AGAC C AGT CGTCGCTGGTCTCCTCGTC C AC GGAC AGC 300 

Qy 301 CCGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 360 

I I I I I I I I I I I MINIM I II II II M I I I I II II M M I I 

Db 301 CCGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 360

Qy 361 GACGAGGAGGAGGAGGAGGACGAGGAGGAGGAC GACGAGGACCTAGAGGAACT GGAGGT G 420 

M II II II I II II II I II M II M I II M I I II I I I M I I I II M II I I I M I I M I I II 

Db 361 GACGAGGAGGAGGAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACT GGAGGT G 420

Qy 421 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCG 480 

I I M I I I I II I I M M M II I I II II I I I I I II II II II I I I | | 

Db 421 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCG 480 

Qy 481 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 540 

M M M I M M M I M I I I I I I I I | | || | || | | | || | || || | || | | | | || || 

Db 481 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 540



Qy 541 GCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCG 600 

M II I M I I M II I I II I I I II I M I I M I I II M II I II II I II II I II I II I II I I II 

Db 541 GCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCG 600 

Qy 601 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCT 660 

I M I M II II I II II I I I II II M I II II I II I I II II II I II II I II I II II I M II II 

Db 601 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCT 660 

Qy 661 CCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCG 720 

M M II I I I II II I I M II I II I I I I II I I I II I I I I M I || | || | || | | | || | | || | | | 
Db 661 CCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCG 720 



Qy  721 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTT 780

Qy 781 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 84 0 

M I I I I I I I I I I I I II I I I I I I | | | | | | | | | | | | | | | | | | | M | | | | | | | M I I I I I I | | 
Db 781 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 84 0 

Qy 841 TTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTG 900 

I M I I I I I I I M I I I I II I I I I I I I I I I I I I I I I | | | | | | | | || | | | | | | | M I I I I I M 
Db 841 TTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTG 900 

Qy 901 CTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 960 

I I I I M I I I I I I II I I I II I I I I I I I I I I I I || M I I I I I I I I I I I I I I I I I I I I | | | | | 
Db 901 CTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 960 

Qy 961 CAT GGAT AC CT T G GT AAC T TAT C AGC AGT GT CAT C CT CAGAAG GAACAATT GAAGAAACT 102 0 

M II II M I I I I II I I I I I I I I I I I I I I I I I || I | | | | | | | | | || | | | | M | | | | | | | | | 
Db 961 CAT GGAT AC C T T GGT AAC T TAT C AG C AGT GT CAT C C T CAGAAGGAACAAT T GAAGAAACT 1020 

Qy 1021 T T AAAT GAAG CTT C TAAAGAGTT GC C AGAG AGGGCAACAAAT C CAT T T GT AAAT AGAGAT 10 80 

I I I I I I I I I I I M I I I I M I I I I I I II I II I I I I I I I I I I I I I I I II I I I I I I I I I | | M 
Db 1021 T TAAAT GAAGCTT CTAAAGAGT T GC C AGAGAGGGCAACAAAT C CAT T T GT AAAT AGAGAT 1080 

Qy 1081 TT AGCAGAAT TTT CAGAAT T AGAAT AT T CAGAAAT GGGAT CAT C T T T TAAAGGCT C C C C A 114 0 

I I I I I I I I I I I I I I I I I I I I I I I I | | | || | | | | || | | | | | | | | | | | | || | | | | 

Db 1081 T TAG CAGAAT T TT CAGAAT T AGAAT AT T CAGAAAT GGGAT CAT CT TT T AAAG G CT C C C CA 114 0 

Qy 1141 AAAGGAGAGT CAGC C AT AT T AGT AGAAAAC AC TAAGGAAGAAGT AAT T GT GAGGAGT AAA 1200 

I M I I I I I I I I I I I I M I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1141 AAAGGAGAGT CAGC CAT AT T AGT AGAAAAC ACT AAGGAAGAAGTAAT T GT GAGGAGTAAA 12 00 

Qy 12 01 GACAAAGAG GAT T T AGT T T GT AGT GC AG C C C T T C ACAGT C CACAAGAAT C AC CT GT GGGT 12 60 

I M I I I I I I I I II II I I II I I I I I I I I I I I | | | | | | | | || | | M | | | | | | | | M I I I I I I 
Db 1201 GACAAAGAGGAT T TAGT T T GT AGT G C AGC CCT T CACAGT C CACAAGAAT C AC CT GT GGGT 1260 

Qy 1261 AAAGAAGACAGAGTT GT GT CT C CAGAAAAGACAAT GGACATTTTTAAT GAAAT GCAGAT G 1320 

M I I I I I I II I II I I I I I I M I I I I I I I M I I I I I I I I I I II I I I I I I I I I I 

Db 12 61 AAAGAAGACAGAGT T GT GT CT C CAGAAAAGACAAT GGACATT T T T AAT GAAAT GCAGAT G 1320 

Qy 1321 T C AGT AGT AG C AC CT GT GAGGGAAGAGT AT G C AGACT T TAAGC CAT T T GAACAAG CAT GG 1380 

I I I M I I I I I I I I I I M I I I II I I I II I I I I I I I I I I I I I I I I II I I I I || | | | | | | M | 
Db 1321 T C AGT AGT AGC AC CT GT GAG G GAAGAGT AT GC AGACT T TAAG C CAT T T GAACAAGC AT GG 1380 

Qy 1381 GAAGT GAAAGAT ACTTAT GAGGGAAGT AGGGAT GT GCT GGCT GCTAGAGCTAAT GT GGAA 144 0 

I I I I I I I I I I I I I I M M I I I I I I I II I II I I II I I I I I M I I I I I I I I I I I I I I I I II I 
Db 1381 GAAGTGAAAGATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCTAATGTGGAA 1440 

Qy 1441 AGTAAAGT GGACAGAAAAT GCTT GGAAGATAGC CT GGAGCAAAAAAGT CTT GGGAAGGAT 1500 

I I M I I I I I I I I I I I I II I I I I I I I I I I I I I I I || | | | | | | | | | | | | | | | | || I I II I I I 
Db 1441 AGTAAAGT GGACAGAAAAT G CT T GGAAGATAGC CT G GAGCAAAAAAGT CTT GG GAAGG AT 1500 

Qy 1501 AGT GAAGGC AGAAAT GAG GAT GCTTCTTTCCC C AGT AC C C CAGAAC CT GT GAAGGACAGC 1560 

I I M I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | | M I I I I I I I I I I I I I 

Db 1501 AGT GAAGGC AGAAAT GAGGAT GCTTCTTTCCC CAGTAC C C CAGAAC C T GT GAAGGACAGC 1560 

Qy 1561 T C C AG AGC AT AT AT T AC CTGTGCTTCCTT TAC C T CAGCAAC C GAAAGC AC CAC AG CAAAC 162 0 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I II II I I I I I I I I 

Db 1561 T C C AGAGC AT AT AT T AC CT GTGCTTCCTT TAC CT C AG CAAC C GAAAG CAC C AC AGCAAAC 1620 



Qy 1621 ACT T T C C CT T T GT T AGAAGAT CAT ACT T C AGAAAATAAAAC AGAT GAAAAAAAAAT AGAA 168 0 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1621 AC TTTCCCTTTGT T AGAAGAT CAT ACT T CAGAAAAT AAAAC AGAT GAAAAAAAAAT AGAA 168 0 

Qy 1681 G AAAG G AAG G C C C AAAT T AT AAC AGAGAAGACT AG C C C C AAAAC GT C AAAT CCTTTCCTT 174 0 

I I M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I t II I I I II I I I I I I 
Db 1681 GAAAGGAAGGCCCAAATTATAACAGAGAAGACTAGCCCCAAAACGTCAAATCCTTTCCTT 174 0 

Qy 1741 GT AGC AGT AC AGGAT T C T GAGGC AGAT TAT GT T ACAAC AGAT AC C T TAT CAAAGGT G ACT 1800 

I I I M I M I M II I I II I I I I M II I I I I I I I II II I I I || I I I I I I | | | | | I I I I || M 
Db 1741 GT AG C AGT AC AG GAT T CT GAGGC AGAT TAT GT T AC AAC AGAT AC CT TAT CAAAGGT GAC T 18 00 

Qy 18 01 GAG G C AG C AGT GT C AAAC AT GC CT GAAGGT CT GAC GC C AGAT T T AGT T C AGGAAGC AT GT 1860 

M I M I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I II || I I I I | I I | I I I I | | | | 
Db 1801 GAG GC AG C AGT GT CAAACAT GC C T GAAGGT C T GAC GC C AGAT T T AGT T C AGGAAGCAT GT 18 60 

Qy 1861 GAAAGT GAACT GAAT GAAGCCACAGGT ACAAAGATT GCTTAT GAAACAAAAGTGGACT T G 1920 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1861 GAAAGT GAACT GAAT GAAGC CAC AG GT ACAAAGATT GCT T AT GAAACAAAAGT GGACT T G 192 0 

Qy 1921 GT C CAAACAT C AGAAGC T AT ACAAGAAT C ACT T T AC C C CAC AGCAC AGCT T T GC C CAT C A 198 0 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I II I I I M I I I I I II I I 
Db 1921 GT C CAAACAT CAGAAG CTAT ACAAGAAT C ACT T TAC C C CAC AG C ACAG CT T T G C C CAT C A 198 0 

Qy 1981 T T T GAGGAAGCT GAAGC AACT C C GT CAC CAGT T T T GC CT GAT ATT GT TAT GGAAG CAC C A 2 040 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I 
Db 1981 T T T GAGGAAG C T GAAG CAACT C C GT CAC C AGTT T T GC C T GAT ATT GT TAT GGAAGCAC C A 204 0 

Qy 2 041 TTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTG 2100 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 2041 TTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTG 2100 

Qy 2101 GAAG CAC C T C CT C CAGT T AGT TAT GAC AGT ATAAAGCT T GAGC CT GAAAAC C C C C CAC C A 2160 

I I I I I I I I M I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I i i I I I I I I I I I I I I 
Db 2101 GAAGC AC CT C CT C CAGT T AGT TAT GAC AGT AT AAAGCT T GAGC CT GAAAAC C C C C CAC C A 2160 

Qy 2161 TATGAAGAAGC CAT GAATGT AGCACTAAAAGCTTT GGGAACAAAGGAAGGAATAAAAGAG 222 0 

I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I M 
Db 2161 TAT GAAGAAGC CAT GAAT GT AG CAC TAAAAGCTT T GGGAACAAAGGAAGGAATAAAAGAG 2220 

Qy 2221 C C T GAAAGT TT TAAT G CAGCT GTT C AGGAAACAGAAG CT C CT T AT AT AT C CAT T GC GT GT 22 8 0 

I I I I I I M I II I I I I I I I I I I I I I I I I II M I I I I II I I I I I I I II I I I I I II M I I I I I 
Db 2221 C CT GAAAGT TT TAAT GCAG CT GTT CAGGAAAC AGAAGC T C CT T AT AT AT C CATT G C GT GT 228 0 

Qy 2281 GAT T TAAT TAAAGAAACAAAG C TCT C C ACT GAGC CAAGT C CAGAT T T CT C TAATT AT T C A 234 0 

I I I M I I I I I I I I I I I I I I I II I I I M I I M I I I I II I I I I I II II I I I I I II I I I I I I I 
Db 2281 GAT T TAAT T AAAGAAACAAAGCT CT C CAC T GAG C CAAGT C CAGAT T T C T CT AAT TAT T C A 234 0 

Qy 2341 GAAAT AG C AAAATT C GAGAAGT CG GT GCC C GAACAC GCT GAG CTAGT GGAGGATT C CT C A 24 00 

I M I I I I I I I I I I I I I I I I I I I || || I I I I I I I I I I I I || | | | | | | | | | | | | | | | | | M | 
Db 2341 GAAAT AG CAAAAT T C GAGAAGT C GGT G CC C GAACAC GCT GAGCT AGT G GAG GATT CCT C A 2400 

Qy 2401 C CT GAAT CT GAAC CAGTT GACTTATTT AGT GAT GATT C GATT C CT GAAGT CCCACAAACA 2460 

I I I I M I I I M I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I II II I I I I I I I I I I I I 
Db 2401 C CT GAAT C T GAAC CAGT T GACT T AT T T AGT GAT GAT T C GAT T C CT GAAGT C C CACAAAC A 24 60 



Qy 2461 CAAGAGGAGGC T GT GAT GCT CAT GAAGGAGAGT C T C ACT GAAGT GT CT GAGAC AGT AGC C 252 0 

1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I M I I I I I I I I I I I I 
Db 2461 CAAGAGGAGGC T GT GAT GCT CAT GAAGGAGAGTCT CACT GAAGT GT CT GAGACAGTAGC C 2520 

Qy 2521 C AG C ACAAAGAG GAGAGAC T T AGT G C C T C AC C T CAGGAGCT AG GAAAG C CAT AT T T AGAG 258 0 

M I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I II I I II I I I I | | I I I I I | I I I 
Db 2521 C AG CACAAAGAG GAGAGAC T T AGT G C CT CAC C T C AGGAG CT AGGAAAGC CAT AT T T AGAG 2580 

Qy 2581 T CT T T T C AG C C CAAT T T AC AT AGTACAAAAGAT GCT G CAT C T AAT GAC ATT C CAACATT G 2640 

I I I M I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I | I || 
Db 2581 T CT T T T CAG C C CAAT T T ACAT AGT ACAAAAGAT GCT G CAT CT AAT G ACAT T C C AACAT T G 2 64 0 

Qy 2 641 AC CAAAAAGGAGAAAATT T CT T T GCAAAT G GAAGAGT T T AAT ACT GC AAT T TAT T CAAAT 2700 

I I I M II I I I I I I I I I I II I I || M | | M I I I I I I I I I I I I I I I I I I I | | | || | | | | | | | 
Db 2641 AC CAAAAAGGAGAAAATT T C T T T GCAAAT GGAAGAGT T T AAT ACT GCAAT T TAT T CAAAT 2700 

Qy 2701 GAT GACT T ACT T T C T T C TAAGGAAG ACAAAATAAAAGAAAGT GAAAC AT T T T CAGATT C A 2760 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I || | | | 
Db 2701 GAT GAC T TACT T T CT T CTAAGGAAGACAAAAT AAAAGAAAGT GAAAC AT TT T C AGAT T C A 2760 

Qy 27 61 T CT C C GAT T GAGAT AAT AGAT GAAT TT CC CACGTTT GT CAGT GCTAAAGAT GATT CT CCT 282 0 

I I I I I I M I I I M M I I I I I I I I I I I M I I I I I I I I I I II I I I I II I II I I I I I I I | | || 
Db 2761 T CT C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT T T GT CAGT GCTAAAGAT GAT T CT C CT 282 0 

Qy 2 821 AAATTAGCCAAGGAGTACACTGATCTAGAAGTATCCGACAAAAGTGAAATTGCTAATATC 28 80 

M I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I | | | | 
Db 2821 AAAT TAGCCAAGGAGT ACACTGAT CTAGAAGT AT C C GACAAAAGT GAAATT GCTAATAT C 2880 

Qy 2881 CAAAGC G GGGC AGAT T C ATT G C CT T GCTT AGAAT T G CC CT GT GAC CT T T C T T T CAAGAAT 294 0 

I I I I I I I I I I M I I I I II I I II I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 28 81 CAAAGC GGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGT GAC CTTTCTTT CAAGAAT 294 0 

Qy 2 941 AT AT AT C CT AAAGAT GAAGT AC AT GT T T CAGAT GAATT CT C C GAAAAT AGGT C CAGT GT A 3000 

I I I I I I I I I I I II I I II I I I I I I I I II M I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 2 941 AT AT AT C C TAAAGAT GAAGT AC AT GT T T CAGAT GAAT T CT C C GAAAAT AG GT C CAGT GT A 3000 

Qy 3001 TCTAAGGCATCCATATCGCCTT CAAAT GTCTCTGCTTTGGAACCTCAGACAGAAATGGGC 3060 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 3001 T CT AAG G CAT C CAT AT C G C CTT CAAAT GT CTCTGCTTT GGAAC CT C AGAC AGAAAT GGGC 3060 

Qy 3061 AG CAT AGT T AAAT C CAAAT CAC T T AC GAAAG AAG C AGAGAAAAAAC TTCCTTCT GAC AC A 3120 

I I I I I M I I I I I I I I I I I I I I M I I I I I I I I II I I I I I I I || | I | | | | | | | | | | | | | | | | 
Db 30 61 AGCAT AGT TAAAT C CAAAT CACT T AC GAAAGAAGCAGAGAAAAAACT T C CT T CT GAC AC A 3120 

Qy 3121 GAGAAAGAGGAC AGAT C C CT GT C AGCT GT AT T GT CAGC AGAGCT GAGT AAAAC T T CAGT T 3180 

M I I I I I I I I I I I M I I I I I I I I I I I I I I I I II I I I I I I I I | | | | | | | | | | M II I II II 
Db 3121 GAGAAAGAG GACAGAT C C C T GT CAG C T GT AT T GT CAG C AGAGCT GAGTAAAACT T CAGT T 3180 

Qy 3181 GTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTA 3240 

M I M I I I I I I I I M I I I I I I I I I I I I I I I I II I II I I I I II I I I I I I I I I II I I I I I I I 
Db 3181 GTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTA 324 0 

Qy 3241 TT CCT GCT GCT GTCTCTGACAGTGTTCAGCATTGT CAGT GTAAC GGCCT ACAT TGCCTTG 3300 

I M I I I I I I I I I I I I I I II I I I I I I I I II I II I I I I I I I I I II I I I I I I I 

Db 3241 TTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTG 3300 

Qy 3301 GCCCTGCTCTC G GT GACT AT CAG CTT TAG GAT AT AT AAG G GC GT GAT CC AGG CT AT CC AG 3360 



Db 3301 G C C C T GCT CT C G GT GACT AT CAG C T T TAG GAT AT AT AAGG GCGT GAT C C AGGCT AT C C AG 3360 

Qy 3361 AAAT CAGAT GAAGGCCAC CCATT CAGGGCAT ATTTAGAAT CTGAAGTT GCTATATCAGAG 342 0 

M I I I I M M I I I I I II I I I I I I I I I I I I I I I | | | | M I I I I I I I I I I I II I I I II | I I I 
Db 3361 AAAT CAGAT GAAGG C CAC C C ATT CAG G G CAT AT T T AGAAT CT GAAGT T GCT AT AT CAGAG 342 0 

Qy 3421 GAATT GGTT CAGAAATACAGTAATT CT GCT CTT GGT CAT GT GAACAGCACAATAAAAGAA 34 8 0 

I I I M I I I I I I I I I M I I I I I I I I I I I II I I II I I I I I II I I I I I I I I || I I I I || | | || 
Db 3421 GAATT GGTT CAGAAATACAGTAATT CT GCT CTT GGT CAT GT GAACAGCACAATAAAAGAA 3480 

Qy 34 81 CTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTGATG 354 0 

I I I I I I I I M I I I I M I I M I I I I I I I I I I I I I I I M I I I I I I II I I I I I I I I II 

Db 34 81 CTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTGATG 354 0 

Qy 3541 TGGGTGTTTACTTATGTT GGT GCCTTGTTCAAT GGT CTGACAC TACT GATTTTAGCTCTG 3600 

I I M I I I I M I M I I I I I I I I M I I II I I I I I I I I II I I I I I I I I I II I I I I I I I I | | | | 
Db 3541 TGGGTGTTTACTTATGTT GGT GCCTTGTTCAAT GGT CTGACAC TACT GATTTTAGCTCTG 3600 

Qy 3601 AT C T CAC T C TT C AGT AT T C CT GT T ATT TAT GAAC GG C AT CAG GT GCAGAT AGAT CAT TAT 3660 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I M I I I I I II I I I I [ II I I I I I I || | | M 
Db 3601 AT C T CAC T CTT C AGT AT T CC T GT T AT T TAT GAAC GGCAT C AG GT GCAGAT AGAT CAT TAT 3660 

Qy 3661 CT AGGAC T T GCAAACAAGAGT GT T AAG GAT G C CAT GGC CAAAAT C CAAG CAAAAAT C C C T 372 0 

I I I M II I I I I I I I I I I I I I M I I | | M || || | | | | | | | | | M | || | | | | | | | | || | | | | 
Db 3661 C T AGGACT T GCAAACAAGAGT GT TAAG GAT GC C AT GGC CAAAAT C CAAGCAAAAAT C C C T 3720 

Qy 3721 G GAT T G AAG C G C AAAG CAGAT 3741 

I I I I I I I I I I I I I I I I I I I I I 
Db 3721 G GAT T GAAG C G CAAAGC AGAT 3741 
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CC to uremia, porphyria, hypoglycemia, Sjorgren Larsson syndrome, acute 

CC sensory neuropathy, chronic ataxic neuropathy, biliary cirrhosis, primary 

CC amyloidosis, obstructive lung diseases, acromegaly, malabsorption 

CC syndromes, polycythemia vera, immunoglobulin (Ig)A- and IgG gamma- 

CC pathies, complications of various drugs (e.g., metronidazole) and toxins 

CC (e.g., alcohol or organophosphates), Charcot-Marie-Tooth disease, ataxia 

CC telangiectasia, Friedreich's ataxia, amyloid polyneuropathies, 

CC adrenomyeloneuropathy, Giant axonal neuropathy, Refsum's disease, Fabry's 

CC disease, or lipoproteinemia. The present sequence represents a DNA 

CC encoding the rat neurotransmitter receptor protein Nogo (Nogo-A, Nogo-B 

CC and Nogo-C), an example of NS-specific antigen 

XX 

SQ Sequence 4684 BP; 1358 A; 1047 C; 1112 G; 1167 T; 0 U; 0 Other; 



Query Match 100.0%; Score 3739.4; DB 6; Length 4684; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 3740; Conservative 0; Mismatches 1; Indels 0; Gaps 0 



Qy    1 ATTGCTCGTCTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCG 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 ATTGCTCGTCTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCG 60

Qy   61 ATCGCGAAGGCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT 120
        ||||||||||||| ||||||||||||||||||||||||||||||||||||||||||||||
Db   61 ATCGCGAAGGCAGGAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT 120

Qy  121 CGGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACA 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 CGGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACA 180

Qy  181 ACCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGC 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 ACCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGC 240

Qy  241 GACCCGCCAGCCATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGC 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 GACCCGCCAGCCATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGC 300

Qy  301 CCGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 CCGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 360

Qy  361 GACGAGGAGGAGGAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTG 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 GACGAGGAGGAGGAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTG 420

Qy  421 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCG 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCG 480

Qy  481 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 540

Qy  541 GCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 GCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCG 600

Qy  601 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCT 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCT 660

Qy  661 CCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 CCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCG 720

Qy  721 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTT 780

Qy  781 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 840

Qy  841 TTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTG 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 TTGATGGAGCAGCCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTG 900

Qy  901 CTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 CTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 960

Qy  961 CATGGATACCTTGGTAACTTATCAGCAGTGTCATCCTCAGAAGGAACAATTGAAGAAACT 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 CATGGATACCTTGGTAACTTATCAGCAGTGTCATCCTCAGAAGGAACAATTGAAGAAACT 1020

Qy 1021 TTAAATGAAGCTTCTAAAGAGTTGCCAGAGAGGGCAACAAATCCATTTGTAAATAGAGAT 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 TTAAATGAAGCTTCTAAAGAGTTGCCAGAGAGGGCAACAAATCCATTTGTAAATAGAGAT 1080

Qy 1081 TTAGCAGAATTTTCAGAATTAGAATATTCAGAAATGGGATCATCTTTTAAAGGCTCCCCA 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 TTAGCAGAATTTTCAGAATTAGAATATTCAGAAATGGGATCATCTTTTAAAGGCTCCCCA 1140

Qy 1141 AAAGGAGAGTCAGCCATATTAGTAGAAAACACTAAGGAAGAAGTAATTGTGAGGAGTAAA 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 AAAGGAGAGTCAGCCATATTAGTAGAAAACACTAAGGAAGAAGTAATTGTGAGGAGTAAA 1200

Qy 1201 GACAAAGAGGATTTAGTTTGTAGTGCAGCCCTTCACAGTCCACAAGAATCACCTGTGGGT 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GACAAAGAGGATTTAGTTTGTAGTGCAGCCCTTCACAGTCCACAAGAATCACCTGTGGGT 1260

Qy 1261 AAAGAAGACAGAGTTGTGTCTCCAGAAAAGACAATGGACATTTTTAATGAAATGCAGATG 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 AAAGAAGACAGAGTTGTGTCTCCAGAAAAGACAATGGACATTTTTAATGAAATGCAGATG 1320

Qy 1321 TCAGTAGTAGCACCTGTGAGGGAAGAGTATGCAGACTTTAAGCCATTTGAACAAGCATGG 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 TCAGTAGTAGCACCTGTGAGGGAAGAGTATGCAGACTTTAAGCCATTTGAACAAGCATGG 1380

Qy 1381 GAAGTGAAAGATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCTAATGTGGAA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 GAAGTGAAAGATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCTAATGTGGAA 1440

Qy 1441 AGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAAGTCTTGGGAAGGAT 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 AGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAAGTCTTGGGAAGGAT 1500

Qy 1501 AGTGAAGGCAGAAATGAGGATGCTTCTTTCCCCAGTACCCCAGAACCTGTGAAGGACAGC 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 AGTGAAGGCAGAAATGAGGATGCTTCTTTCCCCAGTACCCCAGAACCTGTGAAGGACAGC 1560

Qy 1561 TCCAGAGCATATATTACCTGTGCTTCCTTTACCTCAGCAACCGAAAGCACCACAGCAAAC 1620
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCCAGAGCATATATTACCTGTGCTTCCTTTACCTCAGCAACCGAAAGCACCACAGCAAAC 1620

Qy 1621 ACTTTCCCTTTGTTAGAAGATCATACTTCAGAAAATAAAACAGATGAAAAAAAAATAGAA 1680
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1621 ACTTTCCCTTTGTTAGAAGATCATACTTCAGAAAATAAAACAGATGAAAAAAAAATAGAA 1680

Qy 1681 GAAAGGAAGG C C CAAAT T ATAACAGAGAAGAC TAG C C C CAAAAC GT CAAAT CCTTTCCTT 174 0 

I M I I I I I I I II I I I I I I I I I I ! I I I I 1 I I I I I I I I I I I I I I I I I I | I I || M I I I I I I I 
Db 1681 GAAAGGAAGGC C CAAATT AT AAC AGAGAAGAC TAG C C C CAAAAC GT CAAAT CCTTTCCTT 1740 

Qy 1741 GT AGC AGT AC AG GAT T CT GAG GC AGAT TAT GT T ACAAC AGAT AC C T TAT CAAAGGT GACT 18 00 

I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I || I || | | 
Db 1741 GTAGCAGTACAGGATT CT GAGGCAGATTAT GTTACAACAGATACCTT AT CAAAGGT GACT 1800 

Qy 1801 GAGGCAGCAGT GT CAAACAT GCCT GAAGGT CT GACGC CAGATTTAGTT CAGGAAGCAT GT 1860 

I I I M I I I I I I I M I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I || I | | I M | | | M | | 
Db 1801 GAGGCAGCAGT GT CAAACAT GCCT GAAGGT CT GACGC CAGATTTAGTT CAGGAAGCAT GT 1860 

Qy 1861 GAAAGT GAACT GAAT GAAG C C AC AG GT ACAAAGATT GC T TAT GAAAC AAAAGT GGAC T T G 1920 

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I M I I I I I M I 
Db 18 61 GAAAGT GAACT GAAT GAAG C C AC AG GT ACAAAGAT T G CT T AT GAAACAAAAGT G GAC T T G 1920 

Qy 1921 GT C CAAACAT C AGAAGCT ATAC AAGAATCACT T T AC C C CAC AG C ACAGCT T T GC C CAT CA 198 0 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I || I M II 
Db 1921 GT C CAAACAT C AGAAG CT AT ACAAGAAT CACT T T AC C C CAC AG C AC AGCT T T GC C CAT CA 198 0 

Qy 1981 T T T GAG GAAG CT GAAG CAAC T C C GT C ACCAGT T T T GC CT GAT ATT GT TAT GGAAGCAC C A 2 04 0 

I M I I I I I I II I I I II I I I I I | | | | | | | | | | | | | | | | | | || | | | | | | | | | | | | | || | || | 
Db 1981 T T T GAG GAAGCT GAAG CAACT C C GT C ACCAGT T T T GCCT GAT ATT GT TAT G GAAGCAC C A 2 040 

Qy 2041 TTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTG 2100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 2041 TTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTG 2100

Qy 2101 GAAGCAC CT C CT C C AGT T AGTT AT GACAGTAT AAAG CT T GAGC C T GAAAAC C C C CC AC C A 2160 

I I I I I I I I M I I I I I I II I I I I II I I I | | | | | | | | | | | | | | | | | | | | | || | | | | | | | || | 
Db 2101 GAAGCAC C T C CT C C AGT T AGTT AT GAC AGT AT AAAG CT TGAGC CT GAAAAC C C C C CAC C A 2160 

Qy 2161 TATGAAGAAGCCATGAATGTAGCACTAAAAGCTTTGGGAACAAAGGAAGGAATAAAAGAG 2220 

M I I I M I I I II I I I I I I I || I I I | | | | | | | | | | | | M I I II I I I I I I I I I I I I | | I I | | 
Db 2161 TAT GAAGAAGC CAT GAAT GT AG CAC TAAAAGC T T T GGGAACAAAGGAAG GAAT AAAAGAG 2220 

Qy 2221 CCTGAAAGTTTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTTATATATCCATTGCGTGT 2280 

I I M I I I I I I I I I I I I I I || | | | | | | | | | | | | | | | | | | | | | || || | | | | | || | | | | | | || 
Db 2221 CCT GAAAGT TTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTTAT AT AT CC ATT GCGTGT 2280 

Qy 2281 GAT T T AAT TAAAGAAAC AAAG CT C T C CACT GAG C CAAGT C CAG AT T T CT CT AAT T ATT CA 2 34 0 

I I I I I I I M I I I II I I I II I | I I | | | | | | | | | | M | M | | | | | | || | | | | | | | | | | | | | | 
Db 22 81 GAT T T AAT T AAAGAAACAAAGCTCT C CACT GAGC CAAGT C C AGAT T T C T CT AAT T ATT C A 234 0 

Qy 2341 GAAAT AG CAAAAT T C GAGAAGT CGGTGCCC GAACAC GCT GAG C T AGT G GAG GAT T C CT C A 2400 

M I I I I I I I I I I I I I II I I I I I I I | I I | | | | | || || | | | | | | | | | M II I I I I I I I I I I I 
Db 2341 GAAAT AGC AAAAT T C GAGAAGT CGGT GC C C GAACAC GC T GAGC T AGT GG AG GAT T C CT CA 24 00 

Qy 2401 CCT GAAT C T GAAC CAGT T GACT T ATT T AGT GAT GAT T C GAT T C CT GAAGT C C C ACAAAC A 2460 

I I I I I I II I I I I I M I I I II I I I II I I I II I | I || I | | || | | | | | || M 

Db 2401 CCT GAAT CT GAAC CAGT T GACT TATT T AGT GAT GAT T C GAT T C CT GAAGT C C C ACAAACA 24 60 

Qy 24 61 CAAGAGGAGGCT GT GAT GCT CAT GAAG GAGAGT CT CACT GAAGT GT CT GAG AC AGT AGC C 2520 

I I I I I I I I I I I I I I II M I I I I I I I I I I I I II I I I | | | | | || I M 

Db 24 61 CAAGAGGAGGCT GT GAT GCT CAT GAAGGAGAGT CT CACT GAAGT GT CT GAGACAGTAGCC 2520 



Qy 2521 C AGC ACAAAGAGGAGAGACT T AGT G C CT C AC CT CAGGAGC T AGGAAAG C CAT AT T TAGAG 258 0 

I M I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I | I I I 1 I I I I | I | | 
Db 2521 C AGC AC AAAGAGGAGAGACT T AGT GCC T C AC C T CAG GAGC T AGGAAAGC CAT AT T TAGAG 2580 

Qy 2581 T C T T T T C AGC C CAAT T T AC AT AGT AC AAAAGAT G CT G CAT CT AAT GAC AT T C CAACAT T G 264 0 

I I I I I I I I I M I I I I I I I I I I I I I I I I II II I I I I II I I I I I I I I I I I I I I I I I I I | I I I 
Db 2581 T CT T T T CAGC C CAAT T T ACAT AGT ACAAAAGAT G CT G CAT C T AAT GAC AT T C CAACAT T G 264 0 

Qy 2641 AC CAAAAAG GAGAAAAT T T CT T T GCAAAT G GAAGAGT T T AAT ACT G CAAT T TAT T CAAAT 2700 

I I I I I M I I I I I I I I II I I I I I I I I M I I I I I | | | | | | | | | | | | | | | | || | M I I I I I II 
Db 2641 AC CAAAAAGGAGAAAAT T TCT T T GCAAAT GGAAGAGTT TAAT ACT GCAAT T TAT T CAAAT 2700 

Qy 2 7 01 GAT GAC T TACT T T C T T CT AAGGAAGACAAAATAAAAGAAAGT GAAAC AT T T T CAGAT T C A 2760 

I M I I I I I I I I I I I I I I II I I I I I I I I I M I I M I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 2701 GATGACTTACTTTCTTCTAAGGAAGACAAAATAAAAGAAAGTGAAACATTTTCAGATTCA 2760 

Qy 2761 T C T C C GAT T GAGAT AATAGAT GAAT T T CC CAC GT T T GT CAGT GCT AAAGAT GAT T CT C C T 2820 

I I I I I I I I I I I I I I I II I I I I I I I II II I I I I I I I I I I I I I I i I I I I I M I I I | | | | | M 
Db 2761 T CT C C GAT T GAGAT AAT AGAT GAATT T CC C AC GTT T GT CAGT G CTAAAGAT GAT T CT C C T 2820 

Qy 2 821 AAATTAGCCAAGGAGTACACTGAT CT AGAAGTAT CCGACAAAAGT GAAATT GCTAATAT C 2880 

M I M I I I I I I I I I I I I I I II I I I I I I I I I | I | | | | | | | | | | | | | | | | | || M I I I I I I I 
Db 2 821 AAAT T AGC CAAGGAGT AC ACT GAT CT AGAAGTAT C C GACAAAAGT GAAAT T G C TAAT AT C 2880 

Qy 2 8 81 CAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTCAAGAAT 294 0 

I I I I I I M I I II I II I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 28 81 CAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTCAAGAAT 294 0 

Qy 2941 AT AT AT C C T AAAGAT GAAGT AC AT GT T T CAGAT GAAT T CT C C GAAAAT AGGT C CAGT GT A 3000 

I I I I I I M I I I I I I I I I I I I I I I I I I I II II I I II I I I II I I I I I I I I I I I I I I I M I I I 
Db 2 941 AT AT AT C CTAAAGAT GAAGT ACAT GT T T CAGAT GAAT T CT C C GAAAAT AG GT C CAGT GT A 3000 

Qy 3001 T CTAAGGCAT C CATATCGCCTT CAAAT GTCTCT GCT TTGGAACCTCAGACAGAAATGGGC 3060 

I M I II I I I I I I I I I I I I I I I II I I I I I I I I I | | | | M I I I I I I I I II I II I I I I I I I I I 
Db 3001 T CT AAG G CAT C CAT AT C GC C T T CAAAT GT CT CT GCT T T GGAAC CT C AGAC AGAAAT GGGC 3060 

Qy 30 61 AG C AT AGTTAAAT C CAAAT C ACTT AC GAAAGAAGCAGAGAAAAAAC TTCCTTCT GACAC A 3120 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | II I I I II I I I I I I I I I I I I I I I i I 
Db 3061 AGC AT AGT T AAAT C CAAAT CAC TT AC GAAAGAAGCAG AGAAAAAACT T C C T T CT GACAC A 3120 

Qy 3121 GAGAAAGAGGACAGATCCCTGTCAGCTGTATTGTCAGCAGAGCTGAGTAAAACTTCAGTT 3180 

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 3121 GAGAAAGAGGACAGATCCCTGTCAGCTGTATTGTCAGCAGAGCTGAGTAAAACTTCAGTT 3180 

Qy 3181 GTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTA 324 0 

M I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 3181 GTT GAC CT C CT C T AC T GG AGAGAC AT T AAGAAGAC T G GAGT GGTGTTTGGT GC CAGCT T A 324 0 

Qy 3241 TTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTG 3300 

I I I I I I I II I I I I I I I II I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3241 TTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTG 3300 

Qy 3301 GCCCTGCTCTCGGT GAC TAT C AGCT T TAG GAT AT AT AAGG GC GT GAT C CAG G C TAT C CAG 3360 

I I I I M I I I I I I I || I I I I I I I I I I I I | | | | | | | | | | | | | | || | | M M | | | | || | | | | | 
Db 3301 GCCCTGCTCTC G GT G ACT AT C AGC T T TAG G AT ATATAAGG GC GT GAT C C AGG CT AT C CAG 3360 



Qy 3361 AAAT C AGAT GAAG GC CAC C CAT T C AGGG C AT AT T T AGAAT CT GAAGT T G CT AT AT CAGAG 342 0 

M II M I I I I I I I II I I I I I I I I I I I I I I I I I M I I | | | | | | | | | | | | | | | | | | | M | | | 
Db 3361 AAAT C AGAT GAAG G C CAC C CAT T C AG G G CAT AT T T AGAAT CT GAAGT T G CT AT AT CAGAG 342 0 

Qy 3421 GAATTGGTT CAGAAATACAGTAAT T CT GCT CTT GGT CAT GT GAACAGCACAATAAAAGAA 34 80 

I I I I M M I I I I I I I M I I I I I I I II I I II I I I I I I I I I | | | | | | | | | | | | | | | | | | | | | 
Db 3421 GAAT T G GT T CAGAAATACAGTAAT T C T G CT CT T G GT CAT GT GAACAGCACAATAAAAGAA 34 80 

Qy 34 81 CTGAGGCGGCTTTTCTTAGTT GAT GATTTAGTTGATTCCCT GAAGT TTGCAGTGTT GAT G 354 0 

M I II I I I I I I I I I I I I I M II I I I I I I I | | | | M I I II I I I I I || I I I | | | | || I I I I I 
Db 34 81 CTGAGGCGGCTTTTCTTAGTT GAT GATTTAGTTGATTCCCT GAAGT TTGCAGTGTT GAT G 354 0 

Qy 3541 TGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTG 3600 

M I I I I I M I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I | I I | | | || | 
Db 3541 TGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTG 3600 

Qy 3601 AT C T C ACT CTT C AGT AT T CCT GT TAT T TAT GAAC GGC AT C AG GT GC AGAT AGAT CAT TAT 3660 

I I I I I I I I I I I I I M I II I I I I I I II I I II I I I I I I | | | || | | | | | | | | M M I I I I I I I 
Db 3601 AT CT CACT CTT C AGT AT T C CT GT T ATT TAT GAAC G G CAT C AG GT GC AGAT AGAT CAT TAT 3660 

Qy 3661 CT AGGACTT GCAAACAAGAGT GT TAAGGAT GC C AT GGC CAAAAT C C AAGCAAAAAT C C CT 3720 

I I M II I I I I M I I I I I I I I I I I I I I I I I I I I I | | | M | | | | || | | | | | | | | | | | | | | || 
Db 3661 C T AGGACT T GCAAACAAGAGT GT TAAGGAT GCC AT GGC CAAAAT C CAAGCAAAAAT C C CT 3720 

Qy 3721 G GAT T GAAG C G C AAAG C AGAT 3741 

I I I I I I II I I I II I I I I I I I I 
Db 3721 GGATTGAAGCGCAAAGCAGAT 3741 
DE Human cDNA encoding the Nogo protein. 
XX 

KW Human; Nogo receptor; axonal growth; immunogen; antibody; nogo protein; 

KW cranial trauma; cerebral trauma; spinal cord injury; stroke; 

KW demyelinating disease; multiple sclerosis; monophasis demyelination; 

KW encephalomyelitis; multifocal leukoencephalopathy; panencephalitis; 

KW Marchiafava-Bignami disease; pontine myelinolysis; adrenoleukodystrophy; 

KW Pelizaeus-Merzbacher disease; Spongy degeneration; Alexander's disease; 

KW Canavan's disease; metachromatic leukodystrophy; viral infection; 

KW Krabbe's disease; AB020693; ss. 

XX 

PR 12-JAN-2000; 2000US-0175707P. 

PR 26-MAY-2000; 2000US-0207366P. 

PR 29-SEP-2000; 2000US-0236378P. 
XX 
DR P-PSDB; AAU09453. 
XX 

PT Novel Nogo receptor protein useful for identifying modulator of Nogo 

PT protein or Nogo receptor protein, which is useful for treating central 

PT nervous system disorders. 
XX 

PS Example 1; Page 95-100; 109pp; English. 
XX 

CC The sequence (GenBank accession number AB020693) encodes the human Nogo 

CC protein, a 250kDa myelin-associated axon growth inhibitor. The invention 

CC relates to the use of the nogo receptor, nogo protein, their nucleic 

CC acids, vectors expressing them and antibodies against them, to isolate 

CC agents which block nogo receptor mediated axonal growth. The agent is 

CC useful for treating a central nervous system disorder which is a result 

CC of cranial or cerebral trauma, spinal cord injury, stroke or a 

CC demyelinating disease selected from multiple sclerosis, monophasis 

CC demyelination, encephalomyelitis, multifocal leukoencephalopathy, 

CC panencephalitis, Marchiafava-Bignami disease, pontine myelinolysis, 

CC adrenoleukodystrophy, Pelizaeus-Merzbacher disease, Spongy degeneration, 

CC Alexander's disease, Canavan's disease, metachromatic leukodystrophy, 

CC viral infection and Krabbe's disease 

XX 

SQ Sequence 4053 BP; 1189 A; 922 C; 922 G; 1020 T; 0 U; 0 Other; 

Query Match 62.6%; Score 2343.6; DB 4; Length 4053; 

Best Local Similarity 81.3%; Pred. No. 0; 

Matches 3017; Conservative 0; Mismatches 574; Indels 119; Gaps 21; 
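
The two percentages in the statistics above can be reproduced from the counts printed beside them. The sketch below is an illustration only: it assumes that Best Local Similarity is Matches divided by the number of aligned columns (Matches + Mismatches + Indels), and that Query Match is Score divided by the perfect score of 3741 given in the report header. The report does not state these formulas; they are assumptions that happen to agree with the printed values.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)

# Values copied from the statistics block of this result.
matches, mismatches, indels = 3017, 574, 119
score = 2343.6
perfect_score = 3741  # from the report header (query length, identity scoring)

# Assumed definitions, not stated by the report itself.
aligned_columns = matches + mismatches + indels
best_local_similarity = percent(matches, aligned_columns)
query_match = percent(score, perfect_score)

print(best_local_similarity, query_match)
```

Under these assumptions the script gives 81.3 and 62.6, matching the report.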

Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193 

I II III I I Ml MINI I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I 
Db 16 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 7 5 

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252 

I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
Db 76 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 134 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I I M II II I I I I I I I I I I MINI I I | | | | | | | M | | | | M | | | | | M | 
Db 135 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 191 

Qy 313 CCGCCCGCCTT CAAGT AC CAGT T C GT GAC G GAGC CC GAG GAC GAGGAGGAC GAG GAGGAG 372 

I MINI I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | M I I I I II II III 
Db 192 CAG CCCGCGTT CAAGT AC CAGT T C GT GAG GGAGC CC GAG GAC GAGGAG GAAGAAGAG 248 



Qy 373 GAGGAGGAC GAGGAGGAGGAC GACGAGGAC CTAGAGGAACT GGAGGT GCT GGAGAGGAAG 432 

I M I I I II I I I I I I I I II I I I I I I I I I I I II I I I I I II I I | | | | || I | | | | | | | 
Db 249 GAGGAG GAAGAG GAG GAC GAG G AC GAAGAC C T G GAG GAG CT GGAGGT GCT GGAGAGGAAG 308 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

I I I I I I I I I I I I II I I I I I I I || M || | M I I I II I III 

Db 309 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 368 

Qy 4 87 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 54 6 

I I I II I I I I I I I I I I | | | | | | || || | | | | | | | | | | | || | M M I I I I 
Db 369 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 42 8 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 5 97 

II I Ml M III I II I II I II I I II I I Mill I I I I I I I 

Db 42 9 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 488 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I I I I I I I I II I I I I II II I I I I I I I I I I I I I I I I I I I M M I I I I I I I I I I 
Db 4 89 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 54 8 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

MINIM I I I I II I I I I I II II I I I IN I I I I I I M Ml III 

Db 549 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 60 8 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

M II I I I II II I II I II II II II I M II M I I II I II 
Db 609 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 668 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

M M M M I M I M M I I II I I I I I I I II I II I M I I II I I I II I I I || II II I I M 
Db 669 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 728 

Qy 808 GT GAT AC CCTCCTCTG CAGAAAAAAT T AT GGATT TGAT GGAGC AGC C AGGT AAC AC T GT T 8 67 

M II I M I II 1 II II I I I I I I I MUM I I M M M I I M 

Db 72 9 GTGATACGCTCCTCTGCAGAAAA TAT GG ACT T GAAGGAGC AGC CAGGT AACAC TAT T 785 

Qy 8 68 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

M I M I II II II I I I II II II I II I I I II II I II I I I I I I I || I I | | | | | || | || I II 
Db 786 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 845 

Qy 92 8 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

M M I I II II I I II I I I I I I I I I I I I II | | | | | M I I I II I I II II I I I || 
Db 84 6 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 905 

Qy 988 GT GT CAT CCT CAGAAGGAACAATT GAAGAAACTTTAAAT GAAGCT T CTAAAGAGTT GC CA 1047 

M I I M I M I I II MUM I I I II I I II I II I I I II M I M 

Db 906 GTATTAC C CACT GAAGGAACACTT CAAGAAAAT GT CAGT GAAGCTT CTAAAGAGGT CT CA 965 

Qy 1048 GAGAGGGCAACAAAT CCATTT GTAAATAGAGATTTAGCAGAATTTT CAGAATTAGAAT AT 1107 

M I I I I M I M M I I M I I I II I II I I I MM II II M II I II II I II I 
Db 966 GAGAAG G CAAAAACT C TACT C AT AGAT AGAGATT T AACAGAGT T T T CAGAAT T AGAAT AC 1025 

Qy 1108 T C AGAAAT G GGAT CAT C T T T T AAAGG CT C C C CAAAAGGAGAGT CAGC C AT AT T AGTAGAA 1167 

M M II M II II II I I I II I I III II I I II I Ml II Ml II II I II I I 
Db 1026 T CAGAAAT GG GAT CAT C GT T CAGT GT C T CT C CAAAAGCAGAAT CT G C C GT AAT AGT AG C A 1085 

Qy 1168 AACACTAAGGAAGAAGTAATTGTGAGGAGTAAA GACAAAG AG GAT T T AGT T T GT AGT 1224 



Db 1086 AAT CCTAGGGAAGAAATAAT CGT GAAAAATAAAGAT GAAGAAGAGAAGTTAGTTAGTAAT 1145 

Qy 1225 GC AGC C C T T C AC AGT C CACAAGAAT C AC C T GTGGGTAAAGAAGAC 1269 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 

Db 1146 AAC AT CC T T CAT AAT C AAC AAGAGT T AC C T ACAG C T C T T AC T AAAT T GGT TAAAGAGGAT 1205 

Qy 127 0 AGAGT TGTGTCTC C AGAAAAGACAAT GGAC AT TT T TAAT GAAAT GC AGAT GT CAGT AGT A 1329 

I I M I I I I I I I I I I || I Ml I I I I I I I I I I I | | M I I I I I I I I 

Db 1206 GAAGT TGTGTCTT C AGAAAAAGC AAAAGAC AGT T T TAAT GAAAAGAGAGT TGCAGT GGAA 1265 

Qy 1330 GC AC CT GT GAGGGAAGAGT AT GC AGACT T T AAGC CAT T T GAACAAGC AT G GGAAGT GAAA 1389 

II Ml I I I II I I II I I II I II II I I II I I I I I I I I I I | I I I I I I I I | M I I 
Db 1266 GC T C CT AT GAG GGAGGAAT AT GC AGAC T T CAAAC CAT T T GAGC GAGT AT G GGAAGT GAAA 1325 

Qy 1390 GAT ACT TAT GAGGGAAGT AGG GAT GT G C T GGCT GCT AGAG C T AATGTG 1437 

I I I I Mill I I I I I I II I I I II I I I I I I I || M 

Db 1326 GATA GTAAGGAAGAT AGT GAT ATGTT GGCT GCT GGAGGTAAAAT C GAGAGCAACTTG 1382 

Qy 1438 GAAAGT AAAGT G GAC AGAAAAT GC T T GGAAGAT AGCCT G GAGCAAAAAAGT CTT GGGAAG 1497 

I I M I I I I I II I I I I I I I I I I I I | | | | | | | I | | | | | | | | | |M | || 
Db 1383 GAAAGTAAAGTGGATAAAAAATGTTTTGCAGATAGCCTTGAGCAAACTAATCACGAAAAA 1442 

Qy 1498 GAT AGT GAAGGCAGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C C AGAAC C T GT GAAG GAC 1557 

I I I I I I I I I II I I I I I Ml I I I I I I I M I I I I II I I I I I I I I I I | M I 
Db 1443 GAT AGT GAGAGT AGT AAT GAT GAT ACTT CT T T CC C C AGT AC GC C AGAAGGT AT AAAGGAT 1502 

Qy 155 8 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTA CCT CAGC AAC C GAAAGCACCACA 1614 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I II I I I I II 
Db 1503 CGTT CAGGAGCATAT AT CACAT GT GCT CC CT TTAACCCAGCAGCAACT GAGAGCATT GCA 1562 

Qy 1615 GCAAACACT TT CCCTTT GTTAGAAGAT CATACTT CAGAAAATAAAACAGAT GAAAAAAAA 1674 

I M I I I III I I I I I I I I I I I I I | | M | | | | | | | | | | | || || | | | | || | | | | | | 
Db 15 63 ACAAAC AT TTTTCCTTTGT T AGGAGAT C CT ACT T CAGAAAAT AAGAC C GAT GAAAAAAAA 1622 

Qy 1675 ATAGAAGAAAGGAAGGCCCAAATTATAACAGAGAAG ACT AGC CCCAAAACGTC AAAT 1731 

Mill II I I I I I I I , I i I I I 11(111 I I I I | | | Mill 

Db 1623 ATAGAAGAAAAGAAGGCCCAAATAGTAACAGAGAAGAATACTAGCACCAAAACATCAAAC 1682 

Qy 17 32 CCTTTCCTT GT AG C AGT AC AGGAT T CT GAGGC AGAT T ATGT T ACAAC AGAT AC CTTAT C A 1791 

I I M I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I Ml || IN II 

Db 1683 CCTTTTCTT GT AGCAG CAC AGGATT C T GAGAC AGAT TAT GT C ACAAC AGAT AAT T T AAC A 1742 

Qy 1792 AAGGT GACT GAGGCAGCAGT GT CAAAC AT GC CT GAAGGT CT GAC G C C AGAT T T AGT TCAG 1851 

I I I I I M I I I I I I II Ml II M I II I M I I I I II Mill I I I I I I I I I I I III 
Db 1743 AAGGT GACT GAGGAAGT C GT G GCAAAC AT GC CT GAAGGC CT GACT C C AGAT T T AGT ACAG 18 02 

Qy 1852 GAAGCAT GT GAAAGT GAACT GAAT GAAGC C AC AGGT ACAAAGAT T GC T TAT GAAACAAAA 1911 

I I I I I I I I I I M I I II II II I I I I I I I II II I I I I I I I || I I || I I I I I I | M I I I 
Db 1803 GAAG CAT GT GAAAGT GAAT T GAAT GAAGT TACT GGT ACAAAG ATT GCT TAT GAAACAAAA 18 62 

Qy 1912 GT GGACT TGGT CCAAACAT CAGAAGCTATACAAGAAT CACTTTACC C CACAGCACAGCTT 1971 

I M I I I II I I I I I I I I II I I I I I I I I MMI I I I II II || II I II I || II I 
Db 18 63 AT G GAC T T GGT T CAAACAT CAGAAGT T AT G CAAGAGT C AC T C TAT CCT GC AGC AC AGCTT 1922 

Qy 1972 T G C C CAT CAT T T GAGGAAGC T GAAG CAACT C CGT CAC CAGT T T T G C CT GAT AT T GT TAT G 2031 
I I I I I I I I I I I I I I II Mill I I II I I I I II II II I I I M II I I I I I 



Db 1923 T GC C CAT CAT T T GAAGAGT CAGAAGCT AC T C C TT C AC C AGT TT T G C CT GACAT T GT TAT G 1982 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

I I I I M I I I I I I I II I I I I I I II I I I I | I | M I I I ! I I II I I I I I | 

Db 19 83 GAAG CAC CAT T GAAT T CT GCAGT T C CT AGT GCTGGTGCTTCCGT GAT ACAGC C CAGC T C A 2 042 

Qy 2092 TCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCTGAAAAC 2151 

II I I I I I I I I II II I I | | | | | | | | | | || | | | | | | | | | | | | | | | | M I 

Db 2043 T CAC CAT T AGAAG C T T CT T C AGTT AAT TAT GAAAGC AT AAAAC AT GAG C CT GAAAAC 209 9 

Qy 2152 CCCC CAC CAT AT GAAGAAGCCAT GAAT GTAGCACT AAAAGCTTTGGGAACAAAGGAA 2208 

I I M I M M I I I I I I II I II I I II I I I I I I I | | | | | | | I I M I I I I I I I 
Db 2100 CCCC CAC CAT AT GAAGAGGCCAT GAGT GTAT CACTAAAAAAAGTAT CAGGAATAAAGGAA 2159 

Qy 2209 GGAATAAAAGAGC CT GAAAGTT T TAAT G C AGCT GT T CAG GAAAC AGAAGCT C CT T AT AT A 2268 

I Ml M I I I I M I I I I I I I I I I I M II I I I I I I I I I I I I I I | || || | | | | | | | | 

Db 2160 GAAATTAAAGAGCCT GAAAATATTAAT GCAGCT CTT CAAGAAACAGAAGCT CCTTATAT A 2219 

Qy 22 69 T C CAT T G C GT GT GAT TTAAT T AAAGAAACAAAG C T C T C C ACT GAGC C AAGT C C AGAT T T C 232 8 

M I I I I I I I M I I I I M I I I I I I I I II I I I I II II MM III III I I II M 
Db 2220 T C TAT T G CAT GTGAT T TAAT TAAAGAAACAAAG CT T T CT G CT GAACC AGC T C C GGAT T T C 2279 

Qy 232 9 T CTAATT ATT C AGAAAT AGCAAAAT T C GAGAAGT C G GT GC C C GAAC AC GC T GAGCTAGT G 2388 

II I M I I I I I I I M I I M I II II I 

Db 2280 T CT GAT TAT T C AGAAAT GGCAAAAGT T GAAC AG C CAGT GC CT GAT C ATT CT GAGCTAGT T 2339 

Qy 238 9 GAG GAT T C C T CAC C T GAAT C T GAAC CAGT T G AC T TAT T T AGT GAT GAT T C GAT T C C T G AA 24 4 8 

II I I I I I M I M M II I I I M II I II I || I || II I M II II I I II II I II II I I I 

Db 234 0 GAAGAT T C CT C AC CT GAT T C TGAAC C AGT T GACT TAT T T AGT GAT GATT CAAT AC CT GAC 2399 

Qy 24 4 9 GTC C CACAAACACAAGAGGAGGCT GT GAT GCT CATGAAGGAGAGT CT CACTGA A 2 502 

I I I I I I M I M I I M M II II II I I I I I II I I I I || II II II M I 
Db 2400 GT T C CAC AAAAACAAGAT GAAAC T GT GAT GCT T GT GAAAGAAAGT CT C ACT GAGACTT C A 2459 

Qy 2503 GT GT CT GAGACAGTAGC CCAGCACAAAGAGGAGAGACTTAGT GC CT CAC CT CAGGAGCTA 2562 

I I IN I Ml M II I II I M II I I II I II I I 

Db 24 60 T T T GAGT CAAT GAT AGAAT AT GAAAAT AAGGAAAAACT C AGT GC T T T GC C AC CT GAGGGA 2519 

Qy 2563 G GAAAGC CAT AT T T AGAGT C T T TT C AGC C CAAT TTACAT AGT ACAAAAGAT GC TGCA 2619 

M I I I I I I I II I II II I I I I II III II II I I Ml | | M I M I I I I 
Db 2520 GGAAAG C CAT AT T T GGAAT C TT TTAAG CT CAGT T TAGATAAC ACAAAAGAT AC C C T GT T A 2579 

Qy 2 620 T CTAAT GACATT CCAACATT GACCAAAAAGGAGAAAATTT CTTT GCAAATGGAAGAGTTT 2 679 

II I M I II II II II II I II I II I II II M I II I I I II I II I I II I III I 

Db 2580 C CT GAT GAAGT T T CAAC AT T GAGCAAAAAG GAGAAAAT T C CT T T GCAG AT G GAGGAGC T C 2639 

Qy 2 68 0 AAT AC T GCAAT T TAT T CAAAT GAT GAC T T ACTT T CTT CT AAGGAAGAC AAAATAAAAGAA 27 39 

I I I I I I M I I M I M I I M I M I I M I I I I II I I I II I I II I I I II I I II 
Db 2640 AGT ACT GCAGT T TAT T CAAAT GAT GAC T TAT T T AT TT CT AAGGAAGC ACAGAT AAGAGAA 2 699 

Qy 2740 AGT GAAACAT T T T C AGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT T T GT C 2799 

I M II II II I II || || || || || || M | M M M I II I II II | | || || II 
Db 27 00 AC T GAAAC GT T T T C AGAT T CAT C T CC AAT T GAAAT TAT AGAT GAGT T C C CT AC AT T GAT C 2759 

Qy 2 800 AGT G C T AAAGAT GAT T C T C C T AAAT T AGC CAAGGAGT AC ACT GAT C T AGAAGT AT C C 28 56 

III I I I I I MUM I M II II I II II | | | || | | | | | || || II II 

Db 2760 AGT T CT AAAAC T GAT T CAT T T T CTAAAT TAG C C AGG GAAT AT ACT GAC CT AGAAGT AT C C 2819 



Qy 2 857 GACAAAAGT GAAAT T G C T AAT AT CCAAAG C G GGG CAGATT CAT T GC CT T GCT T AGAATT G 2916 

I M I I I I I I I I I I I | | | | M II I I I I I I I II I I I I I I I I | I M I I I 

Db 2 82 0 CACAAAAGTGAAATTGCTAATGCCCCGGATGGAGCTGGGTCATTGCCTTGCACAGAATTG 2879 

Qy 2917 C C CT GT GAC CTTTCTTT CAAGAAT AT AT AT C CT AAAGAT GAAG TACATGTTTCA 2 970 

I I I I i I I I I I I I I I I I I I I I I II I I I I I I I I I I I I | | | | | | 

Db 2880 CCCCATGACCTTTCTTTGAAGAACATACAACCCAAAGTTGAAGAGAAAATCAGTTTCTCA 2939 

Qy 2 971 GAT GAAT T C T C C GAAAATAGGT C CAGT GT AT CT AAG GC AT C CAT AT C G C CT T CAAAT GT C 3030 

M I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 294 0 GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 2999 

Qy 3031 TCTGCTTT GGAAC CT C AGAC AGAAAT GG GC AG C AT AGT TAAAT CCAAAT C AC TT AC GAAA 3090 

I I I I I I I I M I I I I I I I I I I I II I I I I I I I I I I I I | | | Ml M I I 
Db 3000 TCTGCTTTGGCCACTCAAGCAGAGATAGAGAGCATAGTTAAACCCAAAGTTCTTGTGAAA 3059 

Qy 3091 GAAG C AGAGAAAAAAC TTCCTTCT GAC AC AGAGAAAGAGGAC AGAT C C CT GT CAG CT GT A 3150 

I I I I I I M I I I I I I I I II I I I I II I I I I I I I I II M I I I I I I I I | | Ml | | 
Db 3060 GAAG CT GAGAAAAAACT T C CT T C C GAT AC AGAAAAAGAGGAC AGAT CAC CAT CT GCT AT A 3119 

Qy 3151 T T GT C AGC AGAGC T GAGTAAAAC T T CAGT T GT T GAC CT C CT C TACT GGAGAGAC AT T AAG 3210 

I I I I M I I I I I I I I I II I I I I I I II I I I I I I II I I I I I I I I I I M I I I I I I M 

Db 3120 T T TT C AGC AGAGCT GAGTAAAAC T T CAGT T GT T GAC CT C CT GT ACTGGAGAGAC AT T AAG 3179 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 327 0 

I I I M II I I M I I I I II I I I I I I I I || I I I I I I M I I I I I II I I I i I I I M II I I 
Db 318 0 AAGAC T GGAGT GGT GT T T G GT GC C AG C CT AT TCCTGCTGCTTT CAT T GAC AGT AT T C AGC 3239 

Qy 3271 ATT GT CAGT GTAACGGCCTACATTGCCTT GGC CCT GCT CTC GGT GAC TAT CAGCTTTAGG 3330 

I I I I I II I I I I I I I I I M I I I I I I I I II I I I I I I I I II I I I I 1 I I M II I I I I II 
Db 3240 ATT GTGAGCGTAACAGCCTACATTGCCTT GGC CCT GCT CTCTGTGACCAT CAGCTTTAGG 3299 

Qy 3331 ATAT ATAAGGGCGTGAT C CAGGCT AT CCAGAAAT CAGATGAAGGCCACC CATT CAGGGCA 3390 

Mill I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I M I I I I I I I I I I I | | | | 
Db 3300 AT AT ACAAGGGT GT GAT C CAAGCT AT C CAGAAAT C AGAT GAAGGC CAC C CATT CAGGGCA 3359 

Qy 3391 TAT T T AGAAT C T GAAGT T GCT AT AT C AGAGGAAT T G GT T CAGAAAT AC AGTAAT T CT GC T 34 50 

Ml I I I I I I I I II I I I I M I M I I I I II I I I II I II II I I I I II I II I II I I I I I 
Db 3360 T AT CT GGAAT CT GAAGT T GCT AT AT C T GAGGAGT T GGT TCAGAAGT AC AGTAAT T CT G CT 3419 

Qy 34 51 CT T G GT CAT GT GAAC AGCACAAT AAAAGAACT GAGGC GGC TT T T C T T AGT T GAT GATT T A 3510 

I I M I I I I I I I II I I MM Mill II I I I I I II I II || I I I II I II II I I II M 
Db 3420 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 34 7 9 

Qy 3511 GTT GATT CCCTGAAGTTTGCAGTGTT GAT GTGGGTGTTTACTTATGTT GGT GCCTTGTTC 3570 

I I I I I I I I I I I M II II II II I I I I II II I I II I I II || I || I II II I II II I M I 
Db 3480 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3539 

Qy 3571 AAT G GT CT GAC ACT AC T GAT T T T AGC T CT GAT CT CACT CT TC AGT AT T C CT GT T AT T TAT 3630 

I I I I I I I I I I M II II II I I I I || I II || I | II I II M II II M II I I II 

Db 3540 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3599 

Qy 3631 GAAC G GC AT CAGGT GC AGAT AGAT CAT TAT C TAG GACT T G CAAAC AAGAGT GT T AAG GAT 3690 

M II M I II I II I I II II I I I I II II I I I I I || M I II II I I I I II II II II III 
Db 3600 GAAC G G CAT CAG G CAC AGAT AG AT CAT TAT C TAG GAC T T G CAAAT AAGAAT G T T AAAGAT 3659 



Qy 3 691 G C CAT GGC C AAAAT C CAAGCAAAAAT C C CT GGAT T GAAG CGCAAAGC AGA 3740 

M Mill I II I II II II I I I II I I I I | | | | | | | | | | | | | | | | | | | || 
Db 3 660 GCTATGGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3709 



RESULT 4 
ACC81048 

ID ACC81048 standard; cDNA; 4053 BP. 
XX 

AC ACC81048; 
XX 

DT 22-JUL-2003 (first entry) 
XX 

DE Human NogoA gene. 
XX 
KW Human; Nogo receptor; NgR; CTS domain; neuroprotective; gene therapy; 
KW axonal growth; central nervous system; CNS; Nogo; spinal cord injury; 
KW cranial trauma; cerebral trauma; spinal trauma; stroke; Krabbe's disease; 
KW demyelinating disease; multiple sclerosis; monophasic demyelination; 
KW encephalomyelitis; multifocal leukoencephalopathy; panencephalitis; gene; 
KW ss. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 135..3713 

FT /*tag= a 

FT /product= "Human NogoA" 

XX 

PN WO2003031462-A2. 
XX 

PD 17-APR-2003. 
XX 

PF 04-OCT-2002; 2002WO-US032007. 
XX 

PR 06-OCT-2001; 2001US-00972599. 
XX 

PA (UYYA ) UNIV YALE. 
XX 

PI Strittmatter SM; 
XX 

DR WPI; 2003-393433/37. 
DR P-PSDB; ABR59667. 
XX 

PT New human Nogo receptor polypeptides and nucleic acids, useful for 
PT decreasing inhibition of axonal growth by a central nervous system 
PT neuron, or in treating central nervous system disease, disorder or 
PT injury, e.g. spinal cord injury. 
XX 

PS Disclosure; Page 126-131; 148pp; English. 
XX 

CC The invention relates to a novel nucleic acid encoding a polypeptide 
CC comprising amino acid residues 27-309 of a 473 amino acid sequence (P1, 
CC human Nogo receptor (NgR) NTLRRCT domain), or residues 27-309 of P1 with 
CC 1-20 conservative amino acid substitutions, and less than a complete CTS 
CC domain, provided that a partial CTS domain, if present, consists of no 
CC more than the first 39 consecutive residues. The nucleic acid of the 



CC invention has neuroprotective activity. The polynucleotide may have a use 

CC in gene therapy. The nucleic acid is useful for decreasing inhibition of 

CC axonal growth by a central nervous system (CNS) neuron. The NgR 

CC polypeptide or an agent inhibits the binding of Nogo to NgR or NgR- 

CC dependent signal transduction in the central nervous system neuron may be 

CC used in treating central nervous system disease, disorder or injury, e.g. 

CC spinal cord injury. Expression of an NgR protein may be associated with 

CC inhibition of axonal regeneration following cranial, cerebral or spinal 

CC trauma, stroke or a demyelinating disease, such as multiple sclerosis, 

CC monophasic demyelination, encephalomyelitis, multifocal 

CC leukoencephalopathy, panencephalitis, or Krabbe's disease. The present 

CC sequence is used in the exemplification of the invention 

XX 

SQ Sequence 4053 BP; 1189 A; 922 C; 922 G; 1020 T; 0 U; 0 Other; 
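
As a quick consistency check on the SQ line above, the per-base counts should add up to the stated sequence length. The sketch below is illustrative only: the helper name `parse_sq` and the regular expressions are hypothetical and written for the OCR'd line layout shown here, not for any official flat-file parser.

```python
import re

# SQ line copied from this record.
sq_line = "SQ Sequence 4053 BP; 1189 A; 922 C; 922 G; 1020 T; 0 U; 0 Other;"

def parse_sq(line):
    """Hypothetical helper: return (length, {symbol: count}) from an SQ line."""
    length = int(re.search(r"Sequence\s+(\d+)\s+BP", line).group(1))
    pairs = re.findall(r"(\d+)\s+(A|C|G|T|U|Other);", line)
    counts = {symbol: int(number) for number, symbol in pairs}
    return length, counts

length, counts = parse_sq(sq_line)

# The composition is self-consistent if the counts sum to the length.
composition_ok = sum(counts.values()) == length

# GC fraction of the database sequence, derived from the same counts.
gc_fraction = (counts["G"] + counts["C"]) / length

print(length, counts, composition_ok, round(gc_fraction, 3))
```

A mismatch between the summed counts and the length would suggest an OCR error in one of the numbers on the SQ line.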



Query Match 62.6%; Score 2343.6; DB 8; Length 4053; 

Best Local Similarity 81.3%; Pred. No. 0; 

Matches 3017; Conservative 0; Mismatches 574; Indels 119; Gaps 21; 

Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193 

I N III I I III I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 

Db 16 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 75 

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252 

I I I I I I M I M I I I I II II I I I I I I I I I I I I I I I I I I M 

Db 7 6 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 134 

Qy 2 53 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I I I I I I I I I I I I I I I I I I I I I I I I ,111111 I M I I I I I II I I I I I I I I 
Db 135 AT GGAAGAC C T GGACCAGT CTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 191 

Qy 313 CCGCCCGCCTT CAAGT AC C AGT T C GT GAC G GAGC C C GAG GACGAG GAG GAC GAGGAGGAG 372 

I I M I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M | | || || Ml 
Db 192 C AGC C C G C GT T CAAGT AC C AGT T C GT GAGGGAGC C C GAG GACGAGGAG GAAGAAGAG 24 8 

Qy 373 GAGGAGGAC GAGGAGGAGGACGACGAGGACCTAGAGGAACT GGAGGT GCT GGAGAGGAAG 432 

I I I I I M I I I I I I I I I II I I I I I I I I I I I I I I | | | || || || || | | | | M | | | | | 
Db 249 GAGGAGGAAGAGGAGGACGAGGAC GAAGACCTGGAGGAGCT GGAGGT GCT GGAGAGGAAG 308 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 48 6 

I M I I I I I I I I II I I M I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 309 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 368 

Qy 4 87 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 54 6 

I I I I I I I I I I II I I I I I II II I I I II I I I || | | I I I I I I I I I | | I I | 
Db 369 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 428 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

M I I I I I I I I I I I I I I I I I I I I I I I I 11 I I I I I I I I I I 

Db 429 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 4 88 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I M I I I I I I I II I I I I I I I I I M | I I I I I I I M I I I I I I I I I I I I I I I | I I I I 
Db 489 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 54 8 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I I I M II I I I I II I I I II || || IN Ml || | | | | | | Ml Ml 



Db 549 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 608 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

M I I I I I I I I I I I I Mill I I I I I I I I I I I II I I I I I 
Db 609 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 668 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I M I I I M I I I II I I I I I I I I I || || | | I I | I | | | | | | | | | | | | | | | M | | | M | | | 
Db 669 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 728 

Qy 8 08 GT GATAC CCT C CT CT GC AGAAAAAATT AT G GAT T T GAT G GAG CAGC C AGGTAAC AC T GT T 8 67 

I I I I I I II I I I I I I I I I I I I I | I I I I I I I M | | | | | | | || | | || M M I I I I 
Db 729 GT GATAC GCT CCT CTGCAGAAAA TAT GGAC T T GAAGGAGCAGC C AGGTAAC AC TAT T 785 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

Ml M II I II I II I I I I II I I I I I I | I I | M | I | | | | | | | | | | | | | I I I I I I I I I I I I 
Db 786 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 845 

Qy 92 8 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I || M II II III II 
Db 846 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 905 

Qy 988 GT GT CAT C CT CAGAAGGAACAAT T GAAGAAAC T T T AAAT GAAG C T T CTAAAGAGT T GC C A 1047 

M I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I M I I II 
Db 906 GT AT T AC C C ACT GAAGGAACACT T CAAGAAAAT GT CAGT GAAG CT TCT AAAGAGGT CT C A 965 

Qy 104 8 GAGAGGGCAACAAAT C CAT T T GTAAAT AGAGATT T AGCAGAAT T T T C AGAAT T AGAAT AT 1107 

M I I I I I I I II II I I II I I || || I I I | | | | | | | | || | | | | | | | M I I M 
Db 966 GAGAAGG CAAAAAC T CT ACT C AT AGAT AGAGAT T T AACAGAGT TT T C AGAAT T AGAAT AC 1025 

Qy 1108 T CAGAAAT G GGAT CAT C T T T T AAAG GC T C C C CAAAAGGAGAGT CAGC CAT AT T AGT AGAA 1167 

I I I I i I I I M I II I I I I II I I I I I I I I I I I I III || Ml || | | | | | | | 
Db 1026 T CAGAAAT GGGAT CAT C GT T CAGT GT CT C T C CAAAAGCAGAAT CT G C C GT AAT AGTAGCA 1085 

Qy 1168 AACACTAAG GAAGAAGTAAT T GT GAGGAGTAAA GACAAAGAGGATT T AGT T T GT AGT 1224 

M M I I I M I I I I I II I I || | | | | II I I I I I 

Db 1086 AAT C C TAGGGAAGAAAT AAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGTT AGT T AGT AAT 1145 

Qy 1225 GCAG C C CT T C AC AGT C C ACAAGAAT CAC CT GT G G GT AAAGAAG AC 1269 

I I I I II I I I I I I I I I I I I I I | M I I I I I I I I 

Db 1146 AAC AT C CT T CAT AAT CAACAAGAGT T AC CT AC AGCT C T T AC T AAAT T GGTT AAAGAG GAT 1205 

Qy 127 0 AGAGT T GT GT C T C CAGAAAAGACAAT GGACAT T T T T AAT GAAAT G C AGAT GT CAGT AGT A 1329 

I I I I I M II I I I I I I I I III I I I I I I I I I I I I | | | | | I I I I I I 

Db 12 06 GAAGT T GT GT CT T C AGAAAAAG C AAAAGACAGT T T T AAT GAAAAGAGAGTT GC AGT GGAA 12 65 

Qy 1330 GC AC CT GT GAG GGAAGAGT AT GC AGACT T TAAGC CAT T T GAACAAGC AT GG GAAGT GAAA 138 9 

M Ml I M I I I I II I I I I II I I I I I II I I I I I I I I I || I I I I I I I I II I I I 
Db 1266 G CT C CT AT GAGGGAGGAAT AT GC AGACT T CAAAC CAT T T GAGC GAG TAT GGGAAGT GAAA 1325 

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 14 37 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I || M 

Db 1326 GATA GTAAG GAAGAT AGT GAT AT GT T GGC T GC T GGAG GT AAAAT C GAGAGC AACT T G 1382 

Qy 1438 GAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAAGTCTTGGGAAG 14 97 

I I I I I M I I I I I I I I I | | M I I I I I I I I I I I I I I I I I M I III I II 
Db 1383 GAAAGT AAAGT G GATAAAAAAT GT T T T G C AGAT AG C C T T GAGC AAACT AAT CAC GAAAAA 1442 



Qy 1498 GATAGT GAAGGC AGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C C AGAAC CT GT GAAGGAC 1557 

I I I I I I I I I II I I I I I II I I I I I I II I I I I I I I I I I M I I I I I Mill 
Db 1443 GATAGT GAGAGT AGT AAT GAT GAT AC TTCTTTCCC CAGT AC G C CAGAAG GT AT AAAGGAT 1502 

Qy 1558 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTA CCTCAGCAACCGAAAGCACCACA 1614 

I I I M I I II || | I M II I I II I I I I || 

Db 1503 C GT T CAGGAG CAT AT AT C AC AT GT G CT C C C T TT AAC C C AG CAGCAACT GAGAGC ATT GC A 1562 

Qy 1615 GCAAAC ACTT T C C CT T T GT T AGAAGAT CAT AC T T CAGAAAAT AAAAC AGAT GAAAAAAAA 1674 

I I I I I I Ml M I I I I I I I I Mill I I I I I I I I I I I I I I I || I I II II I I I I I I 
Db 1563 ACAAACATTTTT C CTTT GTTAGGAGAT C CTACTT CAGAAAAT AAGAC C GAT GAAAAAAAA 1622 

Qy 1675 AT AGAAGAAAG GAAG G C C C AAAT T AT AAC AG AGAAG ACTAGCCCCAAAACGTCAAAT 1731 

I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Nihil | | | | | 
Db 1623 AT AGAAGAAAAGAAG G C C CAAAT AGTAACAGAGAAGAAT ACTAGCAC CAAAAC AT CAAAC 1682 

Qy 1732 CCTTTCCTT GT AG C AGT ACAG GAT T CT GAG G CAGAT TAT GTT ACAACAGAT AC CT T AT C A 1791 

Mill II II I I II II II I II II I M II I II II I II I I I II II II I II I Ml II 
Db 1683 CCTTTT CTT GT AGCAGCACAGGATT CT GAGACAGATTAT GT C AC AAC AG AT AAT TT AAC A 1742 

Qy 1792 AAGGT GACT GAG GCAGC AGT GT CAAAC AT GC CT GAAGGT CT GACGC CAGAT T T AGT T C AG 1851 

I M II I II II II I II III I I II I I II II I I II I I II I II III 

Db 1743 AAGGT GACT GAGGAAGT C GT G G CAAAC AT GC C T GAAGGC CT GACT C CAGAT T T AGTAC AG 1802 

Qy 18 52 GAAGC AT GT GAAAGT GAACT GAAT GAAGC C ACAGGT ACAAAGATT GCT T AT GAAACAAAA 1911 

M M I I II II II II M II II I I I I II I II I I I II I I II II I II II II I I II I I I I I 
Db 18 03 GAAGCAT GT GAAAGTGAAT T GAAT GAAGTTACT GGT ACAAAGATT GCTTAT GAAACAAAA 18 62 

Qy 1912 GT GGACT T GGT C CAAACAT C AGAAGCT ATAC AAGAAT CAC T T T AC CC C AC AG C AC AGCT T 1971 

M I M I II I I I II II II II I I I I III I II I I II I II II II || || | M II II 
Db 18 63 AT G GAC T T GGT T CAAACAT CAGAAGT TAT G C AAGAGT C ACT CT AT CCTG C AGC AC AGCT T 1922 

Qy 1972 T G C C CAT CAT T T GAGGAAG CT GAAGCAACT C C GT CAC CAGT T T T GC C T GAT AT T GT TAT G 2031 

M M II II II II II II I I I I II I II II I II I I I II II II II II I II I II I I II 
Db 1923 T GC C CAT CAT T T GAAGAGT CAGAAGCT AC T C C T T CAC CAGT T T T GCCT GAC AT T GT TAT G 1982 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

M I I I I I I I II II II II I I M II II I I II II I I I I I | | || || II I I 

Db 1983 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 2042 

Qy 2092 T C C C C ACT GGAAG CAC C T CCT C CAGT T AGT TAT GAC AGT ATAAAG CT TGAGC CT GAAAAC 2151 

M Ml I I M I II II I I II II II I I I I M I I II I I II II I I I II I I I I 
Db 2043 T CAC CAT T AGAAG CT T CTT CAGT TAAT TAT GAAAGC ATAAAAC AT GAGC C T GAAAAC 2099 

Qy 2152 C C C C CAC CAT AT GAAGAAGC CAT GAAT GT AGC ACT AAAAGCTTTGGGAACAAAGGAA 2208

I I M II II I I I I II II I I II I I I I II I I I I I I I I II I I MM II I I I I I 
Db 2100 C C C C CAC CAT AT GAAGAG GC CAT GAGT GT AT CACTAAAAAAAGT AT C AG GAATAAAGGAA 2159 

Qy 2209 GGAAT AAAAGAG CCT GAAAGTT TT AAT G CAGC T GT T CAGGAAAC AGAAGCT C CT T AT AT A 2268 

I M I I M I I I II I I I II I I I II I I II II I I II I II II II I I II I II II II I I II 
Db 2160 GAAATTAAAGAGC CT GAAAATATTAAT GCAGCT CTTCAAGAAACAGAAGCT CCTTATATA 2219 

Qy 2269 T C CAT T G C GT GT GAT T TAAT TAAAGAAACAAAG CT CT C C ACT GAG C CAAGT C CAGAT T T C 2328 

M M M I I M II II I I I II I I I I I I I M II I I I II I I II III III II I M I 
Db 2220 T CT AT T GCAT GT GAT T TAAT T AAAGAAACAAAGC T T T CT G CT GAAC C AG C T CC G GATT T C 2279



Qy 2329 T CT AAT TAT T C AGAAAT AGCAAAAT T C GAGAAGT C G GT GC C C GAAC AC GCT GAG C T AGT G 2388 

Ml I I I I I I I I I I I I I I I I I I I I II II I II I I I II || | | | || I | | | | 
Db 2280 T CT GAT TAT T C AGAAAT G GCAAAAGT T GAAC AGC C AGT G C CT GAT CAT T CT GAGC T AGT T 2339 

Qy 2389 GAGGATT CCT CAC CT GAATCT GAAC CAGTTGACTTAT TTAGTGAT GATT CGATT C CT GAA 2448

I I I I M I I I I I I I I I I I I I M I I I I II I I I I M I II I I I I I II I I I II II I I I I I 

Db 2340 GAAGAT T C CT CAC CT GAT T CT GAAC CAGTT GAC T TAT T T AGT GAT GAT T CAAT AC CT GAC 2399 

Qy 2449 GT CC C ACAAAC AC AAGAG GAGG C T GT GAT GCT CAT GAAGGAGAGT C T CAC T GA A 2502 

I I I I I I I I I I M I I I I I I I I I I I II I I I I I I M I I I I I I I I I I I | 
Db 2400 GT T C C ACAAAAACAAGAT GAAAC T GT GAT GC T T GT GAAAGAAAGT C T C ACT GAGAC TT C A 2459 

Qy 2503 GT GT CT GAGACAGT AGCC CAGC ACAAAGAGGAGAGACT TAGTGC CT CACCT CAGGAGCTA 2562 

I I IN I III I I II I I I I I I I I I I I I I I M | 

Db 2460 T T T GAGT CAAT G AT AGAATAT GAAAAT AAG GAAAAACT CAGT GC T T T GC C AC CT GAGGGA 2519

Qy 2563 G GAAAG C CAT AT T TAG AGT C T T T T C AG C C CAAT T T AC AT AG T AC AAAAG AT G C TGCA 2619

I I I I I M I I I I I I I I I I I I I || I I I || I | | | || | I I I I I M I I I I 
Db 2520 GGAAAGC CAT AT T T GGAAT C TT T T AAG C T C AGT T T AGATAAC ACAAAAGAT AC C C T GT T A 2579 

Qy 2620 T CTAAT GAC AT T C CAACAT T GAC CAAAAAG GAGAAAAT TT CT T T G CAAAT GGAAGAGT T T 2679

M I I I I M I I M I I I I I I I I I II I I I I I I II I I I I I I I II I I I I I III I 
Db 2580 CCT GAT GAAGT T T CAACAT T GAGC AAAAAGGAGAAAAT T C CT T T GC AGAT G GAGGAGC T C 2639 

Qy 2680 AAT ACT G CAAT T TAT T CAAAT GAT GAC T T AC T T T C T T CT AAGGAAGACAAAAT AAAAGAA 2739 

I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I MM I I I I 
Db 2640 AGTACTGCAGTTTATTCAAATGATGACTTATTTATTTCTAAGGAAGCACAGATAAGAGAA 2699 

Qy 2740 AGT GAAAC AT T T T C AGAT T CAT CT C C GAT T GAGAT AAT AG AT GAAT T T C C C AC GT T T GT C 2799

I I I M I I M II II II II II I I I || || I I I || M II | || | || || || || || 
Db 2700 ACT GAAAC GT T T T C AGATT CAT C T C CAAT T GAAAT TAT AGAT GAGT T C C CT ACAT T GAT C 2759

Qy 2800 AGT GCT AAAGAT GAT T C T CCTAAATTAGCCAAGGAGTACACT GAT CT AGAAGTAT CC 2856

I I I I I I M I I II I I I II I II I II I II I III II I II II II M II II I I I I 
Db 2760 AGT T CTAAAACT GAT T CAT TTT CTAAATTAGC CAGGGAAT AT ACT GACCTAGAAGTAT C C 2819 

Qy 2857 GAC AAAAGT GAAAT T G C T AAT AT C C AAAG C GGGGCAGAT T CAT T G C CT T GCT T AGAAT T G 2916

I I M I II M M I I I II II II I I I I II 

Db 2820 CACAAAAGTGAAATTGCTAATGCCCCGGATGGAGCTGGGTCATTGCCTTGCACAGAATTG 2879

Qy 2917 C C CT GT GAC CTTTCTTT C AAGAAT AT AT AT C CT AAAGAT GAAG T ACAT GT TT CA 2970 

Ml I M I I I I I M M I II I I I I I I I I II II II I II I I I I II 

Db 2880 C CC CAT GACCTT T CT TT GAAGAACATACAACCCAAAGTT GAAGAGAAAAT CAGTTT CT CA 2939

Qy 2971 GAT GAAT T CT C C GAAAAT AGGT C CAGT GT AT CT AAGG C AT C CAT AT C GC CT T CAAAT GT C 3030 

I I M I II II II I I I I II I I I II II II I M I II I I II II I I 

Db 2940 GAT GACT T T T CT AAAAAT G G GT CT GC TACAT CAAAGGT GCT CT T AT T G CC T C C AGAT GT T 2999

Qy 3031 TCTGCTTT GGAAC CT C AGACAGAAAT G G GC AGCAT AGT T AAAT C CAAAT C ACT T AC GAAA 3090

I I I I I II II I II I I I II I II I I || || || || I I I | | | || | | | | | || 
Db 3000 TCTGCTTT G GC C AC T CAAG CAG AG AT AGAGAGC AT AGT T AAAC C CAAAGT T C T T GT GAAA 3059

Qy 3091 GAAG C AGAGAAAAAAC T T C C T T CT GACAC AGAGAAAGAG GAC AGAT C C CT GT CAG C T GT A 3150 

I M I I II I I II I I II I | | M I I II II II I I M I II I I II II II I || || I II 
Db 3060 GAAGCT GAGAAAAAACTTCCTTCCGATACAGAAAAAGAGGACAGAT CACCATCTGCTATA 3119 

Qy 3151 T T GT CAG CAGAG CT GAGT AAAACT T CAGT T GT T GAC C T C CT C T AC T G G AGAGACAT T AAG 3210 



Db 3120 T T T T C AG CAGAG CT GAGT AAAAC TT C AGT T GT T GAC C T C CT GT ACT G GAGAGAC AT TAAG 3179

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270

I I I I I I I M I I I I I I I II I II II I I I I II I I I I I I I I I I I II I I I II I I I I | I I I 
Db 3180 AAGAC T G GAGT GGTGTTTGGTGC C AGC C TAT TCCTGCTGCTTT CAT T GAC AGT AT T CAGC 3239

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

I I I I I II I I I I I I II I II I I II I II I I I I I I | | | | | | | | M II I I I I I I I I I I I I 
Db 3240 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3299 

Qy 3331 AT AT AT AAGGGC GT GAT C C AG G CT AT C C AGAAAT C AGAT GAAGGC CAC C CAT T C AGGGC A 3390 

I I I I I INN I I I M I I I I I I I I I I M I I I I I I | | | | | | | | | | | || | | | | | | U | | | 
Db 3300 AT AT ACAAGGGT GT GAT C CAAG CT AT C CAGAAAT CAGAT GAAGGC CAC C CAT T C AGG GC A 3359

Qy 3391 TAT T TAGAAT C T GAAGT T GC TAT AT C AGAGGAAT T GGT T CAGAAAT AC AGT AAT T CT GCT 3450 

HI I I M I I I I I I I M I I I I I I I I Mill I I I I I I I I I II I I I | | | | | | | || | | | 
Db 3360 TAT C T GGAAT CT GAAGT T GC TAT AT CT GAG GAGT T GGT T CAGAAGT AC AGTAAT T C T GCT 3419 

Qy 3451 CT T GGT C AT GT GAACAG CACAATAAAAGAAC T GAGGCGGCT T T T C TT AGT T GAT GATTT A 3510 

M I I I II I I I M I I I I I II I II I I I I I I I I I I M I I I I II I I I I 

Db 3420 C T T GGT CAT GT GAACT G CAC GAT AAAGGAAC T CAGGCGC C T C T T C TT AGT T GAT GAT TT A 3479

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570 

M I I I I I I II I I I I I I I II I I I M I I I I I I I I I II I I I I I I | | | | | | | | | 

Db 3480 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3539 

Qy 3571 AAT GGT C T GAC ACT ACT GAT TTT AG C T CT GAT CT C ACT CT T C AGT AT T C CT GT TAT T TAT 3630 

I I M I I I I I I I I I I I I I I I I I I I 111(1 || I I I I I I I I II I I I I I I I I II I I I I I I 
Db 3540 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3599

Qy 3631 GAAC G G CAT CAGGT G CAGAT AGAT CAT T AT CT AG GACT T GCAAACAAGAGT GT T AAGGAT 3690 

I I I I I M I I I I I I I I I I I I M I I I I I I I I I II I || I || II I I I I I I I I || I I II I 

Db 3600 GAAC G GC AT C AGGC ACAGAT AGAT CAT TAT CT AG GACT T GCAAAT AAGAAT GT T AAAGAT 3659 

Qy 3691 GC CAT GGC CAAAAT C CAAGC AAAAAT C C CT GGAT T GAAGC GCAAAGC AGA 3740 

II I I I M I I M I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I || 

Db 3660 G C TAT G G CTAAAAT C CAAGCAAAAAT C C CT GGAT T GAAGC G C AAAGC T GA 3709 
XX

PI Benson DR, Kalos MD, Lodes MJ, Persing DH, Hepler WT, Jiang Y; 
XX 

DR WPI; 2002-627435/67. 

DR P-PSDB; ABP68600. 
XX 

PT New isolated polynucleotide and pancreatic tumor polypeptides, useful for 

PT diagnosing, preventing and/or treating cancer, particularly pancreatic 

PT cancer. 
XX 

PS Claim 1; SEQ ID NO 53; 300pp + Sequence Listing; English.
XX 

CC The invention relates to an isolated polynucleotide (I) comprising: (a) 

CC any of a group of over 4000 nucleotide sequences (ABV94628-ABV99145); (b)

CC complements of (a); (c) sequences consisting of at least 20 contiguous 

CC residues of (a); (d) sequences that hybridize to (a), under moderately 

CC stringent conditions; (e) sequences having at least 75% or 90% identity 

CC to (a); or (f) degenerate variants of (a). Polypeptides (ABP68596- 

CC ABP68637) encoded by (I) and oligonucleotide can be used to detect cancer 

CC in a patient and compositions comprising polypeptides, polynucleotides, 

CC antibodies, fusion proteins, T cell populations and antigen presenting 

CC cells expressing the polypeptide are useful in treating pancreatic cancer 

CC and stimulating an immune response. The polynucleotides can be used as 

CC probes or primers for nucleic acid hybridisation, in the design and 

CC preparation of ribozyme molecules for inhibiting expression of the tumour 

CC polypeptides and proteins in the tumour cells, in vaccines and for gene 

CC therapy. Note: The sequence data for this patent did not form part of the 

CC printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences 
XX 

SQ Sequence 4632 BP; 1398 A; 1013 C; 1011 G; 1210 T; 0 U; 0 Other; 

Query Match 62.6%; Score 2343.6; DB 6; Length 4632; 

Best Local Similarity 81.3%; Pred. No. 0; 

Matches 3017; Conservative 0; Mismatches 574; Indels 119; Gaps 21; 
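
The summary figures above appear to be related by simple arithmetic. The short Python sketch below checks that reading against the numbers in this record. Both formulas are assumptions inferred from the values themselves, not taken from GenCore documentation, and the perfect score of 3741 is the value reported in the run header for the 3741-residue query.

```python
# Values copied from this result record and from the run header.
matches, mismatches, indels = 3017, 574, 119
score = 2343.6
perfect_score = 3741  # run header value; assumed to be the query's self-match score

# Assumed definition: matches divided by all aligned columns
# (matches + mismatches + indel positions).
aligned_columns = matches + mismatches + indels
best_local_similarity = 100 * matches / aligned_columns

# Assumed definition: alignment score as a percentage of the perfect score.
query_match = 100 * score / perfect_score

print(f"Aligned columns:       {aligned_columns}")
print(f"Best Local Similarity: {best_local_similarity:.1f}%")
print(f"Query Match:           {query_match:.1f}%")
```

Under these assumptions the computed percentages round to the 81.3% and 62.6% printed in the record.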

Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193

Ml I I I I I I I I I I I I M I I II II I I I I I I I I II 

Db 23 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 82 



Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252
NMI I M I I I I I I I II I I I I I I I I M I I I I I I I I I I | I | I I II 



Db 83 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 141 

Qy 253 AT GGAAGAC AT AGAC C AGT C GT C G C T GGT CT C CT C GT C C AC GGAC AG CCCGCCCCGGCCT 312 

I I I N I I II II I II I I I MM II I II II I II I 

Db 142 AT GGAAGAC CT GGAC CAGTCTCCTCT GGT CTCGTCCTCGGACAGCCCACCCCGGCCG 198 

Qy 313 C C GC C C GC C T T CAAGT AC C AGT T C GT GAC GGAGC C C GAG GAC GAGGAGGAC GAGGAGGAG 372

I MIMI I I II I I I I I I I II II II II II I I I I II I M I I I II II II I 

Db 199 C AGC C C GC GT T CAAGT AC C AGT T C GT GAG GGAGC C C GAG GAC GAGGAG GAAGAAGAG 255 

Qy 373 GAGGAGGAC GAG GAGGAG GAC GAC GAG GAC C T AGAGGAAC T G GAGGT GCT G GAGAGGAAG 432 

I I I II I I I II I I I I I I M Mill I I I I I Mill I I II M I I I II I M II I II II 

Db 256 GAGGAGGAAGAGGAGGACGAGGACGAAGACCTGGAGGAGCTGGAGGTGCTGGAGAGGAAG 315 
Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486

I I I I I II I II II I I I I I I I M I II I I I I I M I I I M II I II I M I 

Db 316 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 375 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546

I I I M II I I I I I II I M I I II I I ! I M I I II II I II M I II II I II I 

Db 376 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 435 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597

I I I I I I I I I I I I M I I M I I I II II I II M I I I I I I II 

Db 436 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 495

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I II II II I M I I I I I I M I I I I I I M I II II I II I I I II I I I II II II I II 

Db 496 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 555
Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

M I I II II II II II I II I I M II Ml Ml II I I II II III III 

Db 556 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 615 
Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750

M II II II I M II I Mill II II II M II II I II II I 

Db 616 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 675 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I I II II I I I II I I I II II I II II II I I I II II II II || II II I I II I I II I I 

Db 676 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 735 

Qy 808 GT GAT AC CCTCCTCTG C AGAAAAAAT TAT GGAT T T GAT GGAGC AGC CAGGTAACACT GT T 867

I I I I 1 M M II I I I I II I I I II I II II I MM II M II II II I I II II II I II 

Db 736 GTGATACGCTCCTCTGCAGAAAA T AT G GAC TTGAAGGAGC AGC CAGGTAACACT ATT 792 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

HI M I I II I M II I II I I II I I I II I I M II II || || || || || | | || I I I II I II II 
Db 793 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 852

Qy 928 C TAT CTCCTCTCT CAACT GT T T C T TT TAAAGAACAT GGAT AC CT T GGT AACTT AT C AGC A 987 

II I I M I I II II I I I || || || M I II I I I I I II II II I I M III II 

Db 853 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 912 

Qy 988 GT GT CAT C C T C AGAAG GAAC AAT T GAAGAAAC T T T AAAT GAAG C T T C T AAAGAGT T G C C A 1047

I Ml II II M II I I I II 11 II II I I I I I I II I II 

Db 913 GT AT T AC C CACT GAAG GAACACT T CAAGAAAAT GT CAGT GAAGCT T C T AAAG AGGT CT C A 972 



Qy 1048 GAGAGG GCAACAAAT C CAT T T GTAAAT AGAGAT T T AGC AGAAT T T T C AGAAT T AGAAT AT 1107

1 1 1 1 Mill ii II 1 1 II 1 I I I || I I | || | | | | 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1
Db 973 GAGAAG GCAAAAAC T C T AC T CAT AGAT AGAGAT T TAAC AGAGT T T T CAGAAT T AGAAT AC 1032

Qy 1108 T CAGAAAT G GGAT CAT CT T T TAAAG GCT C C C CAAAAG GAGAGT CAG C CAT AT T AGT AGAA 1167

1 1 1 1 1 M 1 1 1 1 1 I I I 1 1 II 11,11,1 I | | || Ml | | |
Db 1033 T CAGAAAT G GGAT CAT C GT T C AGT GT CT CT C CAAAAG CAGAAT CT G C C GTAAT AGTAGC A 1092

Qy 1168 AACACTAAGGAAGAAGTAAT T GT GAGGAGTAAA GACAAAGAGGAT T T AGT T T GT AGT 1224

II 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 II I | 1 1 1 II III! 'II 1
Db 1093 AAT C CTAGGGAAGAAATAAT C GT GAAAAATAAAGAT GAAGAAGAGAAGTTAGTT AGT AAT 1152

Qy 1225 G CAG C C C T T C AC AGT C C AC AAGAAT C AC C T GTGGGTAAAGAAGAC 1269

1 1 1 1 M 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II 1 1
Db 1153 AACATCCTTCATAATCj^ACAAGAGTTACCTACAGCTCTTACTAAATTGGTTAAAGAGGAT 1212

Qy 1270 AGAGT T GT GT C T C C AGAAAAGACAAT GGAC AT T T T TAAT GAAAT GC AGAT GT C AGT AGT A 1329

M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 III 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 I |
Db 1213 GAAGT T GT GT CTT CAGAAAAAGCAAAAGACAGTTTTAAT GAAAAGAGAGTT GC AGT GGAA 1272

Qy 1330 GCACCTGTGAGGGAAGAGTATGCAGACTTTAAGCCATTTGAACAAGCATGGGAAGTGAAA 1389

II HI 1 1 1 1 1 1 1 II M 1 1 1 1 1 1 1 1 1 II 1 1 II II 1 1 1 II 1 1 1 1 M 1 1 1 1 1 1 1
Db 1273 GCTCCTATGAGGGAGGAATATGCAGACTTCAAACCATTTGAGCGAGTATGGGAAGTGAAA 1332

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437

1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 | | | | | | | | M 1 || ||
Db 1333 GATA GT AAGGAAGAT AGT GAT AT GTTGGCTGCT GGAGGT AAAAT C GAGAGCAACT T G 1389

Qy 1438 GAAAGT AAAGT G GACAGAAAAT GCT T GGAAGAT AGC C T G GAGCAAAAAAGT CTT G GGAAG 1497

M 1 1 1 II 1 1 1 1 1 1 1 1 II 1 1 I I | | | | | | | | | || | | | | | | | | Ml | ||
Db 1390 GT^GTAAAGTGGATAAAAAATGTTTTGCAGATAGCCTTGAGCAAACTAATCACGAA^lAA 1449

Qy 1498 GAT AGT GAAG GC AGAAAT GAGGAT GCTTCTTTCCC C AGT AC C C CAGAAC C T GT GAAGGAC 1557

1 1 1 1 1 1 1 1 1 M INN III 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 I 1 1 I
Db 1450 GAT AGT GAGAGT AGT AAT GAT GAT ACT T CT T T C C C C AGT AC GC CAGAAGGT AT AAAGGAT 1509

Qy 1558 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTA C C T CAG C AAC C GAAAG C AC C AC A 1614

III 1 1 1 1 1 1 M 1 1 II 1 II 1 1 1 || | || | | 1 1 1 1 1 | | | | | | | | ||
Db 1510 C GT T CAG GAGC AT AT AT CAC AT GT GCTCCCTT TAAC C CAG CAG CAAC T GAGAG CAT T G C A 1569

Qy 1615 GCAAACACTTT C C CT TT GTTAGAAGAT CATACTT CAGAAAATAAAACAGAT GAAAAAAAA 1674

HUM hi 1 1 1 1 1 1 1 1 1 1 1 1 ii i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 1 1 1 1
Db 1570 ACAAACATTT TT CCTTT GTTAGGAGAT C CTACTT CAGAAAATAAGAC CGAT GAAAAAAAA 1629

Qy 1675 AT AGAAGAAAG GAAG G C C C AAAT T AT AAC AGAGAAG --- ACT AG C C C C AAAAC GT C AAAT 1731

M M 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | | | | | | | | | | | | | Mill
Db 1630 AT AGAAGAAAAGAAG G C C C AAAT AGT AAC AGAGAAGAAT ACT AG CAC C AAAAC AT C AAAC 1689

Qy 1732 CCTTTCCTTGTAGCAGTACAGGATTCTGAGGCAGATTATGTTACAACAGATACCTTATCA 1791

HIM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | | | | | ill ||
Db 1690 CCTTT T CT T GTAGC AGCACAGGATT CT GAGACAGATTAT GT CACAACAGATAATTTAACA 1749

Qy 1792 AAGGTGACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATTTAGTTCAG 1851

1 1 1 1 1 1 1 1 M M 1 II Ml 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 Mill 1 1 1 1 1 1 1 1 1 1 1 III
Db 1750 AAG GT GAC T GAGGAAGT C GT G G CAAACAT GC C T GAAGGC CT GAC T C C AGAT T T AGTAC AG 1809



Qy 1852 GAAGC AT GT GAAAGT GAAC T GAAT GAAG C C ACAGGT ACAAAGAT T GCT T AT GAAACAAAA 1911

N I I I I I I I I I I I I I I I I I I I I I I I I I II || | | | | | || I I I I I | I | | | | | M M | |
Db 1810 GAAGC AT GT GAAAGT GAAT T GAAT GAAGT T ACT GGTACAAAGAT T G C T TAT GAAACAAAA 1869

Qy 1912 GTGGACTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAGCACAGCTT 1971

I I I I I I I I I I I I I I I I I I I I I M IN | | | | | | | | | | I I I II I I I I I
Db 1870 AT GGACTT GGTT CAAACAT C AGAAGTTAT GCAAGAGT CACT CT AT C CT GCAGCACAGCTT 1929

Qy 1972 T G C C CAT CAT T T GAGGAAGC T GAAGCAAC T C C GT C AC C AGT T T T G C C T GAT AT T GTT AT G 2031

I I I I I I I I I M I Mill I II II I I I I I I I I I I I I I II I I I I II I I I I I
Db 1930 T GC C CAT CAT T T GAAGAGT C AGAAGC T ACT C CTT C AC C AGT TT T G C CT GAC AT T GTT AT G 1989

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091

I I I I I I I I I I I I I II I I I I II I I | | | | M I M I I II I I I I I | | M |
Db 1990 GAAG C AC CAT T GAAT T CT G C AGT T C CT AGT GCT G GT GC T T C C GT GAT ACAGC C C AGC T CA 2049

Qy 2092 T C C C CACT GGAAGC AC CT C C T C CAGTT AGT T AT GACAGT AT AAAG CTT GAGC CT GAAAAC 2151

H Ml I I I I I II M I I I I M I I I II I I I I I I I I I I I II I I I I I I I I I
Db 2050 T CACCATT AGAAG CTT CT T CAGTTAATT ATGAAAGC ATAAAACATGAGCCTGAAAAC 2106

Qy 2152 C C C C C AC CAT AT GAAGAAG C CAT GAAT GTAG CACT AAAAGCTT T GG GAACAAAGGAA 2208

I I I I I I I M I I I I I I I I I I M II I I M I I I I I II I I I I I I I I I
Db 2107 C C C C C AC CAT AT GAAGAGGC C AT GAGT GT AT CACTAAAAAAAGT AT C AGGAATAAAGGAA 2166

Qy 2209 GGAATAAAAGAGC CT GAAAGT T T T AAT GCAGC T GT T CAG GAAAC AGAAGCT C CT T AT AT A 2268

I HI I N I I I I II I I I I I I I I I I I M I I I | | | | | | | | | | | || | | | | | | M | | | |
Db 2167 GAAATTAAAGAG C C T GAAAAT AT T AAT GCAG CT CT T CAAGAAAC AGAAGCT C CT T AT AT A 2226

Qy 2269 T C CAT T G C GT GT GAT T TAAT TAAAGAAACAAAGCT C T C CACT GAGC CAAGT C CAGAT T T C 2328

II I I I I I M I I I I I I I I M I I I II I I II I I I II II I I I I Ml III I M I I I
Db 2227 T C T ATT G CAT GT GAT T TAAT TAAAGAAACAAAGCT T T CT G CT GAACC AG CT C C GGAT T T C 2286

Qy 2329 T C TAAT TAT T C AGAAAT AGCAAAAT T C GAGAAGT CGGTGCCC GAAC AC GCT GAGCTAGT G 2388

I N I I I II I I I I II I I I I I I I I I || I I I I I I I I | | | | | | M
Db 2287 T C T GAT TAT T C AGAAAT GGC AAAAGT T GAAC AGC C AGT GC CT GAT CATT C T GAGCTAGT T 2346

Qy 2389 GAGGAT T C CT C AC CT GAAT CT GAAC CAGT T GACTT AT T T AGT GAT GATT C GAT T C C TGAA 2448

II I I I M I M I I I I I I I I I I II I I I I M II I I M I I I I I I I I I I I I | | || IMM
Db 2347 GAAGAT T C C T C AC CT GAT T CT GAAC CAGT T GACTT AT T T AGT GAT GATT CAAT AC C T GAC 2406

Qy 2449 GT C C CACAAAC ACAAGAGGAGG CT GT GAT GCT CAT GAAGGAGAGT CT C ACT GA A 2502

I I I I I I I I I I I I M I M I I I II II I I I II II I I I II II II II II I
Db 2407 GT T CCACAAAAACAAGAT GAAACT GT GAT GCTT GT GAAAGAAAGT CTCACT GAGACTT CA 2466

Qy 2503 GT GT CT GAGACAGTAGC C CAGCACAAAGAGGAGAGACTT AGT GC CT CAC CT CAGGAGCTA 2562

I I M I I III || II I II I I M I I I Ml III |
Db 2467 T T T GAGT CAAT GATAGAAT AT GAAAAT AAGGAAAAAC T CAGT GC T T T G C C AC CT GAGGGA 2526

Qy 2563 GGAAAGC CAT AT T T AGAGT CT T TT C AGC C CAAT T T AC AT AGT ACAAAAGAT GC T GCA 2619

M I I I I I M I M II M I I I M I I II I I II I I I I I | | |
Db 2527 GGAAAG C CAT AT T T G GAAT CT T TTAAGCT CAGT T TAGATAAC ACAAAAGAT AC C CT GT T A 2586

Qy 2620 T C TAAT GAC AT T C C AAC AT T GAC C AAAAAG G AGAAAAT T T C T T T G C AAAT G GAAGAGT T T 2679

II UN II I I I I I I II I I I I I I || | | | | || I I I I M I I I I Mill III I
Db 2587 C C T GAT GAAGT T T CAAC AT T GAGCAAAAAG GAGAAAAT T C C T T T GC AGATG GAG GAGC T C 2646

Qy 2680 AAT ACT G CAAT T TAT T CAAAT GAT GACT T AC T T T C T T CT AAGGAAGACAAAAT AAAAGAA 2739

Db 2647 AGT AC T G CAGT T TAT T CAAAT GAT GACTT AT T TAT T T C TAAGGAAG C AC AG AT AAGAGAA 2706

Qy 2740 AGT GAAAC AT T T T C AGAT T CAT C T C C GAT T GAGAT AAT AGAT GAAT T T C C C AC GT TT GT C 2799

I I I I I I I I I I I I I I I I I I I I I I I I | | | M Mill || | | || II | |
Db 2707 ACT GAAAC GT T T T C AG AT T CAT C T C CAAT T GAAAT TAT AGAT GAGT T C C C T ACAT T GAT C 2766

Qy 2800 AGT GCTAAAGAT GAT TC TCC T AAAT T AGC CAAGGAGT AC AC T GAT C T AGAAGT AT C C 2856

III I I I I I I I II I I I I I I I I I I I I I I I | || | | | | | | | | | | | | M I I M I
Db 2767 AGTT CTAAAACT GAT T CATTTT CTAAATTAGCCAGGGAATAT ACT GAC CT AGAAGT AT CC 2826

Qy 2857 GACAAAAGT GAAAT T GC TAAT AT C CAAAGC GGG GC AG AT T CAT T GC CT T GC T T AGAATT G 2916

I M I I I M I I I I I I I I I I I I I M I I II I
Db 2827 CACAAAAGT GAAAT T GC TAAT G C C C C G GAT GGAGC T GGGT CAT T GC CT T GCACAGAATT G 2886

Qy 2917 C C CT GT GACCTT T CT T T C AAGAAT AT AT AT C C TAAAG AT GAAG TACATGTTTCA 2970

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I
Db 2887 C C C CAT GACCTT T CT T T GAAGAAC AT ACAAC C CAAAGTT GAAGAGAAAAT C AGT T T C T C A 2946

Qy 2971 GAT GAAT T CT C C GAAAAT AG GT C CAGT GT AT C TAAGGCAT C CAT AT C GC CT T CAAAT GT C 3030

I I I I I I I M I II I I I I I I I I I I | | | | | M I I II I I I I I I I
Db 2947 GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 3006

Qy 3031 TCTGCTTT GGAACCT C AGAC AGAAAT GGGC AGC AT AGT T AAAT CCAAAT C ACT T AC GAAA 3090

I I I I I I I I I I I I M Jill II I I I I I I II I I I || M || | Ml | | | |
Db 3007 TCTGCTTTGGC C ACT CAAGC AGAGAT AGAGAGCAT AGT TAAAC CCAAAGTT CT T GT GAAA 3066

Qy 3091 GAAGC AGAGAAAAAAC TTCCTTCT GAC AC AGAGAAAGAGGACAGAT C C CT GT CAGC T GT A 3150

I I I I I I I I II I I I I I I I I I I I I II I | | I I I I I I I I I II I I I I I I II III ||
Db 3067 GAAGCT GAGAAAAAAC TT C CT T C C GAT AC AGAAAAAGAG GACAGAT C AC CAT CT G CT AT A 3126

Qy 3151 T T GT CAGCAGAGCT GAGTAAAAC T T CAGT T GT T GAC CT C C T CT ACTGGAGAGACAT T AAG 3210

M I I I I I I M I I I II I I II I I I I II I I I I I I M I I | | | | | I I I I I | | | | | M I I I I I I
Db 3127 T T T T CAGCAGAGC T GAGTAAAAC T T C AGT T GT T GAC CT C C T GT ACT G GAGAGAC ATTAAG 3186

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270

I M I II I I I I I I I I I I I I I I I I I I I | | | | | | | || | || | | | || | | | | | | | | | | | | |
Db 3187 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3246

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330

I I I I I M I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I || I I I I I I I M
Db 3247 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3306

Qy 3331 ATATATAAGGGCGT GAT CCAGGCTAT C CAGAAAT CAGAT GAAGGCCAC CCATT CAGGGCA 3390

Mill I I I I I I I I I II I I I II I I I I I I I I I I I II || I I | | | | | | | | | | || M | | | | |
Db 3307 AT ATACAAGGGT GT GAT C CAAGCT AT C CAGAAAT CAGAT GAAGGCCACCCATT CAGGGCA 3366

Qy 3391 TAT T TAGAAT CT GAAGT T GCT AT AT C AGAGGAATT GGT T CAGAAAT ACAGT AATT CT GCT 3450

Ml I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I
Db 3367 TAT CT GGAAT C T GAAGT T G C TAT AT C T GAG GAGT T GGT T CAGAAGT AC AGTAATT CT GCT 3426

Qy 3451 C T T GGT CAT GT GAAC AG C ACAATAAAAGAAC T GAGGC G GCT T T T CT T AGT T GAT GATT T A 3510

I I I I I I I II I I I I I I II I I I I I I I Mill II II I || | | || | | M I II I I I I II I
Db 3427 CTT GGT CATGT GAACT GCACGATAAAGGAACT CAGGC GC CT CTT CTT AGTT GAT GATTTA 3486

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570

I I I I I I M I I II II I I I II II || M I II I II I I I || | || || I I I II I I I I I I II I I
Db 3487 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3546



Qy 3571 AATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTAT 3630 

I M I I I II I I I I I I I I I I I I I II I I I I I II I I II I I I I I I I I I M | | | | | | | | | | | 
Db 3547 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3606 

Qy 3631 GAAC G GC AT C AG GT GC AGAT AGAT CAT TAT C TAG GAC T T G CAAACAAGAGT GT T AAGGAT 3690 

I I I I I I I I I I I I I | | || | || M I I I I I I I I I I I I I I I I | | | | | | | | | | | | M III 

Db 3607 GAAC GGC AT C AG GC AC AGAT AGAT CAT TAT C T AGGACT T GCAAAT AAGAAT GT T AAAGAT 3666 

Qy 3691 G C CAT GGCCAAAAT C CAAGCAAAAAT C C CT GGAT TGAAG C GCAAAGC AGA 374 0 

II HIM I I I I I I I I I I M I I I | | | || | | M I I I I I I I I M I I I I || 

Db 3667 GCTATGGCTAAAATC CAAGCAAAAAT CCCT GGAT TGAAGCGCAAAGCTGA 3716 
XX

DR WPI; 2000-224657/19. 

DR P-PSDB; AAY95012, AAY95030. 

XX 

PT New secreted or transmembrane proteins and polynucleotides encoding them, 

PT useful for treating neurodegenerative disorders, autoimmune diseases and 

PT cancer. 
XX 

PS Claim 72; Page 321-322; 357pp; English. 
XX 

CC The invention relates to 40 human secreted proteins (AAY94981-Y95020),

CC and cDNA sequences encoding them (AAA23423-A23462). The secreted proteins

CC of the invention include those that are thought to be only partially 

CC secreted, i.e., transmembrane proteins. The proteins of the invention may 

CC exhibit one or more activities selected from the following: cytokine 

CC activity; cell proliferation; differentiation; immune modulation; 

CC haematopoiesis regulation; tissue growth activity; activin/inhibin 

CC activity; chemotactic/chemokinetic activity; haemostatic and thrombolytic 

CC activity; anti-inflammatory activity; and tumour inhibition activity. The 

CC proteins may be administered to patients as vaccines, and the nucleotides 

CC may be used as part of a gene therapy regime. Diseases or conditions that 

CC may be treated using the proteins or nucleotides of the invention include 

CC autoimmune diseases; genetic disorders; haemophilia; cardiovascular 

CC diseases; cancer; bacterial, fungal and viral infections, especially HIV; 

CC multiple sclerosis; rheumatoid arthritis; pulmonary inflammation; 

CC Guillain-Barre syndrome; insulin dependent diabetes mellitus; and 

CC allergic reactions such as asthma and anaemia. They may also be used for 

CC treating wounds, burns, ulcers, osteoporosis, osteoarthritis, periodontal 

CC diseases, Alzheimer's disease, Parkinson's disease, Huntington's disease

CC and amyotrophic lateral sclerosis (ALS). Proteins with activin/inhibin

CC activity may additionally be useful as contraceptives. Nucleic acid 

CC sequences of the invention may be used in chromosome mapping, and as a 

CC source of diagnostic primers and probes. The present sequence represents 

CC cDNA encoding one of the 40 proteins of the invention 

XX 

SQ Sequence 4093 BP; 1213 A; 926 C; 928 G; 1026 T; 0 U; 0 Other; 

Query Match 62.4%; Score 2333.2; DB 3; Length 4093; 

Best Local Similarity 81.3%; Pred. No. 0; 

Matches 3017; Conservative 0; Mismatches 573; Indels 120; Gaps 22;



Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193

I M I I I I I Ml I I I I I 1 I I I I I I I I I I I I I I I I I I I II I I I II I I I M
Db 33 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 92

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I
Db 93 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 151

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312

II I I I I I I I I I II I I t I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I
Db 152 AT GGAAGACCT GGAC CAGT CT C CT CT GGT CTCGTCCTCGGACAGCCCACCCCGGCCG 208

Qy 313 CCGCCCGCCTT C AAGT ACC AGT T C GT GAC GGAGC C C GAG GAC GAGGAGGAC GAG GAG GAG 372

Db 209 C AG C C C G C GT T CAAGT AC CAGT T C GT GAG GGAG C C C GAGGAC GAGGAG GAAGAAGAG 265



Qy 373 GAGGAGGACGAGGAGGAGGACGACGAGGAC CTAGAGGAACT GGAGGT GCT GGAGAGGAAG 432 

I I I I M I I I I I I I I I I II Mill Mill II M I II M II I II M M M I M I II 

Db 266 GAGGAGGAAGAGGAGGACGAGGAC GAAGACCTGGAGGAGCT GGAGGT GCTGGAGAGGAAG 325 
Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

M II I II M M M I II M I M I MM I I I M M I II I II I M I M 

Db 326 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 385 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546 

M M I M I I I MM Mill M I M M M I II II M M II M II I Ml 

Db 386 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 445 
Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

M I I II II M I I M II M II II I M I II M I I I I I I II 

Db 446 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 505 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I II II II I II M I I II M II M M M II II II M M I II M I M II II I M M 

Db 506 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 565 
Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

II I M I I I MM II M I M II I I III Ml M I II I M Ml Ml 

Db 566 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 625 
Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

M I M II II M I M Mill M M M M M I I M M M 

Db 626 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 685
Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

II II II II I I II I M II M II II II II II M M II M II M M II I II M II II M I 

Db 686 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 745 

Qy 808 GTGATACCCT CCT CTGCAGAAAAAATTATGGATTT GAT GGAGCAGC CAGGTAACACT GTT 867

M II I II II M M M I II M M I M M I I I II M II M M I M II M M M I I 

Db 746 GT GAT AC GCT CCT CT GCAGAAAA TAT GGACTT GAAGGAGCAGCCAGGTAACACTATT 802

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927

III I II II M II M I M M II M M II I I M I I M M II II II I M II I II I M M II 

Db 803 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 862 

Qy 928 C TAT CTCCTCTCT C AAC TGTTTCTTT T AAAGAAC AT G GAT AC C T T GGT AAC T TAT C AG C A 987 

M M I M I II II II I I MUM II M II M M II II II II M II M Ml II 

Db 863 CTGTCTCCT CT CT CAGCC GCT T CTT T CAAAGAACAT GAAT AC CT T GGT AAT TT GT CAACA 922 

Qy 988 GT GT CAT C CT C AGAAGGAACAAT T GAAGAAACT TT AAAT GAAGCT T CT AAAGAGT T GCCA 1047 

M I I M I I II II II M M II M II I I I I II M M I M II M II I II 

Db 923 GT ATT AC C C ACT G AAG GAACACTT C AAGAAAAT GT C AGT G AAGCTT CT AAAGAG GT CT C A 982 

Qy 1048 GAGAGGGCAACAAATCCATTTGTAAATAGAGATTTAGCAGAATTTTCAGAATTAGAATAT 1107

MM Mill M M I I II M M II I M II MM M M M II II II II M I 

Db 983 GAGAAGGCAA-AACTCTACTCATAGATAGAGATTTAACAGAGTTTTCAGAATTAGAATAC 1041

Qy 1108 T CAGAAAT GGGAT CAT CT TTT AAAGGCT C CCCAAAAGGAGAGT CAGCCAT ATT AGTAGAA 1167 

M I II II M II II M M II I I Ml II II I M III M Ml M II II II I 

Db 1042 T CAGAAAT GGGAT CAT C GT T CAGT GT CT CT C CAAAAGCAGAAT CT GCC GT AAT AGTAG CA 1101 



Qy 1168 AAC ACTAAGGAAGAAGT AAT T GT GAGGAGT AAA GAC AAAGAG GAT T T AGT T T GT AGT 1224 

II I M II I I I I I 1 I I I I I I I Mill II I I I I I I I I I I I I I I I I 

Db 1102 AAT C CT AG GGAAGAAAT AAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGTT AGT AAT 1161 

Qy 1225 GCAGC C C T T CAC AGT C C ACAAGAAT C AC C T GTGGGTAAAGAAGAC 1269 

II I I I I I II li II I I I I I I I I I I I I! I I I I I 

Db 1162 

Qy 1270 AGAGTT GT GT CT C CAGAAAAGACAAT GGACATTTTTAAT GAAAT GCAGAT GT CAGTAGTA 1329 

I I I I I I I I I I I I I I I II III I I I I I I I I I I I I I I I I I I I II I I 

Db 1222 GAAGTT GT GT CTT CAGAAAAAGCAAAAGACAGTTTTAAT GAAAAGAGAGT T GCAGT GGAA 1281 

Qy 1330 GCACCT GT GAGGGAAGAGTAT GCAGACTTTAAGC CATTT GAACAAGCAT GGGAAGTGAAA 1389 

|| Ml I I I I I I I II I I I I I I I I I I I II I I I II I I I I II I I I I I II I I I I I I 

Db 1282 GCT C CTAT GAGGGAGGAAT AT GCAGACTT CAAACCAT T T GAGC GAGT AT GGGAAGT GAAA 1341 

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437 

I I I I I I II I I I i I i I I I I I I I I I I I I I I I II I ' 

Db 1342 GATA GTAAG GAAGAT AGT GAT AT GTTGGCTGCT GGAG GT AAAAT C GAGAGCAACT T G 1398 

Qy 1438 GAAAGT AAAGT GGACAGAAAAT GCTT G GAAGATAGC CT G GAG CAAAAAAGT CTT GGGAAG 1497 

I I I I I I I I M I I I I I I I I I I I I I I I I I M I I IE I I I I I I I III I II 

Db 1399 GAAAGT AAAGT GGATAAAAAAT GTTTT GCAGAT AGC CTT GAGCAAACTAAT CACGAAAAA 1458 

Qy 1498 GAT AGT GAAGG C AGAAAT GAGGAT GCTTCTTTCCC CAGT AC C CC AGAAC CT GT GAAGGAC 1557 

I I I I I I I I I II I I I I I III I I I I I II I I I I I I I I I I I I I I I I I I I I I I 

Db 1459 GAT AGT GAGAGT AGT AAT GAT GAT ACT T CT T T C C C CAGT AC G CC AGAAGGT ATAAAGGAT 1518 

Qy 1558 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTA C C T C AG C AAC C GAAAG CAC CAC A 1614 

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M 

Db 1519 CGTT CAGGAGCAT AT AT CACAT GT GCT C C CTTTAACCCAGCAGCAACTGAGAGCATT GCA 1578 

Qy 1615 G CAAAC ACT T T C C CT T T GT T AGAAGAT CAT ACT T CAGAAAATAAAAC AGAT GAAAAAAAA 1674 

MINI III I I I I I I II I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I 

Db 1579 ACAAACAT TTTTCCTTT GT T AG GAGAT C CT ACT T CAGAAAAT AAGAC CGAT GAAAAAAAA 1638 

Qy 1675 AT AGAAGAAAGGAAGGC C CAAATT AT AAC AGAGAAG ACT AGC C C CAAAACGT CAAAT 1731 

I I I I I I I I I I I I I I I I I M I I I I I II I I I I I I I I I I II I I M I I I I I I I I I 

Db 1639 AT AGAAGAAAAGAAG GC C C AAAT AGT AAC AG AGAAGAAT AC TAG CAC C AAAACAT CAAAC 1698 

Qy 1732 CCTTTCCTT GT AGC AGT AC AGGAT T CT GAGGC AGAT TAT GT TACAAC AGAT ACCT TAT C A 1791 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I M 

Db 1699 CCTTTTCTT GTAGCAGCACAGGATTCT GAGACAGATTATGT CACAACAGATAATTTAACA 1758 

Qy 1792 AAGGT GACT GAGGCAGC AGT GT CAAAC AT GCCT GAAG GT CT GAC GC C AGAT T T AGT T C AG 1851 

I I I I I I I I I I I I I II III I I I I I I I I I I I I I I I I I I I II I I I I I I I M II III 

Db 1759 AAGGT GAC T GAGGAAGT C GT GGCAAAC AT GCCT GAAGGC C T GAC T C C AGAT T T AGT ACAG 1818 

Qy 1852 GAAG CAT GT GAAAGT GAAC T GAAT G AAGC C AC AGGT ACAAAGAT T GCT TAT GAAAC AAAA 1911 

I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 1819 GAAGCATGTGAAAGTGAATTGAATGTWVGTTACTGGTACAAAGATTGCTTATGAAACAAAA 1878 

Qy 1912 GT G GAC T T GGT C CAAAC AT CAGAAGCT AT ACAAGAAT CACT T T AC C C C ACAGC ACAGCT T 1971 

I I I I I I I I I I I I I I I I I I I I I II III I II I I I I I I I II II I I I I I I I I II I 

Db 1879 AT GGACTT GGT T CAAAC AT CAGAAGTT AT GCAAGAGT CACT CTAT CCT GCAGC ACAGCT T 1938 

Qy 1972 T GC CC AT CAT T T GAG GAAGCT GAAGCAAC T C C GT CAC CAGT T T T GC CT GAT AT T GT TAT G 2031 



1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II I 

Db 1939 T GC C CAT CAT T T GAAGAGT C AGAAGCT ACT C CT T CAC C AGTT T T G C CT GAC AT T GT TAT G 1998 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 1999 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 2058 

Qy 2092 T C C C CACT GGAAGCAC CT C C T C C AGT T AGT T AT GACAGT AT AAAG CT T GAG C C T GAAAAC 2151 

II Ml | I I I I II II I I II I I I I I I I I II I I I I I I I I I I I I I I I I I II 

Db 2059 T CAC CAT T AG AAG CT T C T T C AGT T AAT TAT GAAAGC AT AAAAC AT GAG C C T GAAAAC 2115 

Qy 2152 C C C C CAC CAT AT GAAGAAG C CAT GAAT GT AG CACT AAAAGCTTTGGGAACAAAGGAA 2208 

I I I I I I I I I II I I I I II I II I I I I I I I I I I I I Mill I I I I I I I I I I I I 

Db 2116 C C C C CAC CAT AT GAAGAG G C CAT GAGT GT AT C ACT AAAAAAAGT AT C AGGAAT AAAGGAA 2175 

Qy 2209 GGAATAAAAGAG C CT GAAAGT T T T AAT GC AGCT GT T CAG GAAAC AGAAGCT C CT TAT AT A 2268 

I Ml I I II I I II I II I I I M I II I I I I I I I I I I I I M I I I I I I I I I II II I I I I 

Db 2176 GAAAT TAAAGAGC CT GAAAAT AT T AAT G C AGC T CT T CAAGAAAC AGAAGCT C CT T AT AT A 2235 

Qy 2269 T C CAT T GC GT GT GAT T T AAT TAAAG AAACAAAGC T CT C CACT GAGC CAAGT C C AGATT T C 2328 

II I I I I I I I I I I I I I I I II I I I I I I I I 1 I I I I I M I I I I III III I I I I I I 

Db 2236 T CT AT T G CAT GT GAT T TAAT TAAAGAAACAAAGCT TT CT GCT GAAC C AGCT C C G GATT T C 2295 

Qy 2329 T CT AAT TAT T C AGAAAT AGCAAAAT T C GAGAAGT C GGT G C C C GAAC AC GC T GAGCT AGT G 2388 

I I I I I I I I I II I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I 

Db 2296 T CT GAT TAT T C AGAAAT GGCAAAAGT T GAAC AGC CAGT GC CT GAT CAT T CT GAGCT AGT T 2355 

Qy 2389 GAG GAT T C CT CAC CT GAAT CT GAAC CAGT T GACT TAT T T AGT GAT GAT T C GAT T C CT GAA 2448 

II I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I 

Db 2356 GAAGAT T C CT C AC CT GAT T CT GAAC CAGT T GAC T T ATTT AGT GAT GAT T CAAT AC CT GAC 2415 

Qy 2449 GT CCCACAAACACAAGAGGAGGCT GT GAT GCT CATGAAGGAGAGT CT CACT GA A 2502 

I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 2416 GTT C C ACAAAAACAAGAT GAAACT GT GAT G CT T GT GAAAGAAAGT CT CACT GAGACTT C A 2475 

Qy 2503 GT GT CT GAGACAGTAGCCCAGC ACAAAGAGGAGAGACTTAGT GCCT CACCT CAGGAGCT A 2562 

I I III I III I I I I I I I I I I I I I I III Ml I 

Db 2476 TTT GAGT CAAT GAT AGAAT AT GAAAAT AAGGAAAAACT CAGT GCTTT GCC ACCT GAGGGA 2535 

Qy 2563 GGAAAG C CAT AT T T AGAGT CT T T T CAG C C CAATT T AC AT AGT ACAAAAG AT GC TGCA 2619 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 

Db 2536 G GAAAG C CAT AT T T GGAAT C T T T T AAG CT CAGT T TAGATAACACAAAAGAT AC C CT GT T A 2595 

Qy 2620 T CT AAT GAC AT T CCAAC AT T GAC C AAAAAG GAGAAAAT T T CT T T GCAAAT GGAAG AGT T T 2679 

I | I I I I II I I I I I I I I I I I M I I I I I I I I I I I I I I II M I I I I I I Ml I 

Db 2596 C CT GAT GAAGT T T CAAC AT T GAG CAAAAAGGAGAAAAT T C CT T T GC AG AT G GAG GAGCT C 2655 

Qy 2680 AAT ACT GCAAT T TAT T C AAAT GAT GAC T T ACT T T CT T CTAAGGAAGACAAAAT AAAAGAA 2739 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I M I I I I I MM 

Db 2656 AGTACTGCAGTTTATTCAAATGATGACTTATTTATTTCTAAGGAAGCACAGATAAGAGAA 2715 

Qy 2740 AGT GAAAC AT TTT C AGAT T CAT CT C C GAT T GAGAT AAT AG AT GAAT T T C C CAC GT T T GT C 2799 

I I I II I I II I I I II II I II II I II Mill II II II II I I II II II M M 

Db 2716 AC T GAAAC GT T T T C AGAT T CAT C T C CAAT T GAAAT TAT AGAT GAGT T C C C T AC AT T GAT C 2775 

Qy 2800 AGTGCTAAAGATGATTC T C CT AAAT T AGC CAAGGAGT ACACT GAT C T AGAAGT AT C C 2856 

I I I I II II MUM I II I II II M II I II I I I I I I II II I II II M I I I 

Db 2776 AGT T CT AAAACT GAT T CAT T T T CT AAATT AGC C AGG GAAT AT ACT GAC C T AGAAGT AT C C 2835 

Qy 2857 GAC AAAAGT GAAAT T GCTAAT AT C CAAAG C G G G GC AGAT T CAT TGCCTTGCT T AGAAT T G 2916 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I II I M I I I I 

Db 2836 C ACAAAAGT GAAAT T GC TAAT GC C C C G GAT GGAGCT G GGT CAT T G C CT T G C ACAGAAT T G 2895 

Qy 2917 CCCTGT GAC CT T T C T T T CAAGAAT AT AT AT C CT AAAGAT GAAG TACATGTTTCA 2970 

III I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I 

Db 2896 C C C CAT GAC CT T T CT T T GAAGAAC AT ACAAC C CAAAGT T GAAGAGAAAAT CAGT T T CT C A 2955 

Qy 2971 GAT GAAT T CT C C GAAAAT AGGT C CAGT GT AT C T AAGGC AT C CAT AT C GC CT T CAAAT GT C 3030 

II I I I I I I I I I I I I I I I I I 111 I I I I I I I I I I I I I I I I II 

Db 2956 GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 3015 

Qy 3031 TCTGCTTT GGAAC C T C AGAC AGAAAT G GGC AGCAT AGT T AAAT CCAAAT CACT T AC GAAA 3090 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I 
Db 3016 TCTGCTTT GGC CACT CAAGC AGAGAT AGAGAG CAT AGT T AAAC CCAAAGT T CT T GT GAAA 3075 

Qy 3091 GAAGC AGAGAAAAAACT T CC T T CT GAC AC AGAGAAAGAGGACAGAT CCCTGT CAGCT GT A 3150 

I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I II III I 

Db 3076 GAAGCT G AGAAAAAACT T CC T T C C GAT AC AGAAAAAGAGGACAGAT C AC CAT C T G CT AT A 3135 

Qy 3151 T T GT CAG C AG AGCT GAGT AAAACT T CAGT T GT T GAC CT C C T C T ACT GGAG AGAC AT T AAG 3210 

II I I I I I I I I I I I I I I II I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I II II I I I I I 

Db 3136 T T T T CAG C AGAGCT GAGT AAAAC T T CAGT T GT T GAC CT C CT GT ACT GGAGAGAC AT T AAG 3195 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 

Db 3196 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3255 

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

I I I I I II I I I I I I II I I I II II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I 
Db 3256 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3315 

Qy 3331 ATATATAAGGGCGTGATCCAGGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCA 3390 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 3316 ATATACAAGGGTGTGATCCAAGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCA 3375 

Qy 3391 TAT T T AGAAT C T GAAGT T GC T AT AT C AGAGGAAT T GGT T CAGAAAT AC AGTAAT T CT GCT 3450 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I II 

Db 3376 TAT C T GGAAT CT GAAGT T GC T AT AT CT GAGGAGT T GGT T C AGAAGT AC AGT AAT T CT GCT 3435 

Qy 3451 CT T G GT CAT GT GAAC AG CACAATAAAAGAAC T GAGG C G G CT TT T CTT AGT T GAT GAT T T A 3510 

II I I I I I I I I I II I I I I I I I I I I I II II I I I I I I II I I I I I I I I II I I I I I I II 

Db 3436 C T T G GT CAT GT GAAC T G CAC GAT AAAGGAACT CAGGC G C CT CT T CTT AGT T GAT GAT T T A 3495 

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I II I I I I I I I II I I I I I I I I I I I 
Db 3496 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3555 

Qy 3571 AAT GGT C T GAC AC T AC T GAT T T T AGCT CT GAT CT CACT CTT CAGT AT T C C T GT TAT T TAT 3630 

I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I II I I 
Db 3556 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3615 

Qy 3631 GAAC GGCAT C AGGT GCAGATAGAT CATT AT CT AGGACTT GCAAACAAGAGT GTTAAGGAT 3690 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I MM I II I I I Ml 
Db 3616 GAAC GGCAT CAGGCACAGATAGAT CATT AT CT AGGACTT GCAAATAAGAAT GTTAAAGAT 3675 



Qy 3691 GCC AT G G C CAAAAT C CAAG CAAAAAT C C C T G GAT T GAAGC G CAAAGC AGA 3740 

II I I I I I II M I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II 
Db 3676 G CT AT GGCTAAAAT C CAAG CAAAAAT C C C T G GAT T GAAGC G CAAAG C T GA 3725 



RESULT 7 
ABS70449 

ID ABS70449 standard; cDNA; 4822 BP. 
XX 

AC ABS70449; 
XX 

DT 27-NOV-2002 (first entry) 
XX 

DE Human bone remodelling gene #106. 
XX 

KW Bone remodelling; osteoporosis; human; gene; ss. 
XX 

OS Homo sapiens. 
XX 

PN US6426186-B1. 
XX 

PD 30-JUL-2002. 
XX 

PF 18-JAN-2000; 2000US-00484970. 
XX 

PR 18-JAN-2000; 2000US-00484970. 
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Jones KA, Volkmuth W, Walker MG; 
XX 

DR WPI; 2002-673014/72. 
XX 

PT A combination of polynucleotides which are co-expressed with genes known 

PT to be involved in bone remodeling and osteoporosis are useful in an array 

PT for the diagnosis of bone remodeling and osteoporosis associated 

PT disorders. 
XX 

PS Claim 1; Col 283-288; 206pp; English. 
XX 

CC The invention relates to a combination comprising a number of 

CC substantially purified and isolated polynucleotides which are co- 

CC expressed with genes known to be involved in bone remodelling and 

CC osteoporosis. The invention is used to diagnose disorders associated with 

CC bone remodelling or osteoporosis. ABS70344-ABS70512 represent human bone 

CC remodelling genes of the invention 

XX 

SQ Sequence 4822 BP; 1441 A; 1046 C; 1073 G; 1247 T; 0 U; 15 Other; 

Query Match 62.1%; Score 2323.8; DB 6; Length 4822; 
Best Local Similarity 80.9%; Pred. No. 0; 

Matches 3060; Conservative 0; Mismatches 587; Indels 137; Gaps 25; 
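
The summary figures above can be cross-checked against one another. The following is a minimal sketch, not GenCore's documented method: it assumes that Query Match is the hit score divided by the perfect score (3741 in the run header) and that Best Local Similarity is Matches divided by the sum of Matches, Mismatches and Indels. Both relations are inferred only from the numbers printed in this listing, and the function name is hypothetical.

```python
def summary_percentages(score, perfect_score, matches, mismatches, indels):
    """Recompute the two percentage fields of a result header.

    Assumed relations (inferred from this listing, not from vendor docs):
      Query Match           = score / perfect_score
      Best Local Similarity = matches / (matches + mismatches + indels)
    """
    query_match = 100.0 * score / perfect_score
    aligned_columns = matches + mismatches + indels
    best_local_similarity = 100.0 * matches / aligned_columns
    # The report prints one decimal place, so round the same way.
    return round(query_match, 1), round(best_local_similarity, 1)


# Values taken from the RESULT 7 header (ABS70449).
print(summary_percentages(2323.8, 3741, 3060, 587, 137))
```

Under these assumptions the recomputed values agree with the printed 62.1% and 80.9%, which suggests the header fields are internally consistent.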

Qy 63 CGCGAAGGCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTTCG 122 

I II I I I I I I II I I I I I II I I I I I II I I I II I I I I I II I II I I I I I I I Ml 

Db 78 CNCGGAGGCAGGAGGAGCAGTCTCATTGTTCCGGGAGCCGTCACCACAGTAGGTCCCTCG 137 



Qy 123 GCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAAC 182 

I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 138 GCTCAGT CGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAAC 182 

Qy 183 CGCCCGCGACTCTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCG 241 

I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 183 CGCCCGCGGCTCTGAGACGCGGCCCCGGNGGCGGCGGCAGCAGCTGCAGCATCATC-TCC 241 

Qy 242 ACCCGCCAGCCATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCC 301 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 242 AC C CT C C AG C CAT G GAAGAC CT G GACC AGT CTCCTCTGGT CTCGTCCTCGGACAGCC 298 

Qy 302 CGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAGG 361 

I MINIM I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I II I I I I I M I 

Db 299 CAC C C C GGC C GCAG CCCGCGTT CAAGT AC C AGT T C GT GAGG GAGC C C GAGGAC GAGGAG- 357 

Qy 362 AC GAG GAGGAGGAGGAGGAC GAG GAGGAG GAC GAC GAGGAC CTAGAG GAACT G GAG GT GC 421 

II II I I I I I II II I II I I I I I II I I II I Mill I I II I I M I M I M I 
Db 358 --GAAGAAGAGGAN GAT GAAGAGGAGGAC GAGGAC GAAGAC C T GGAGGAGC T GGAGGT GC 415 

Qy 422 TGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCG 475 

I I I I I I I II I I I I I I I I I I I I I I II II I II II I I I I I I I I I I I I I II 

Db 416 TGGAGAGGAAGCCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCG 475 

Qy 476 CCGCGCCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGC 535 

I I I I I I I I I I I M I I I I I I I I I I I I I M I I I I I I II I I II I I I I 
Db 476 GCGCGCCNNTAATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGC 535 

Qy 536 CGGCCGCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG 589 

II I II I I Mill I III II III I II II I I II II II I I Mill I 

Db 536 CGGCCGCTCCCCCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGT 595 

Qy 590 CGGCGCCCGCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAG 646 

I I II II II I I I II II I II I I I II II II I I I II II II II I II I I I II I I 
Db 596 CGACCGTGCCCGCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTG 655 

Qy 647 AGGACGACGAGCCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGG 706 

II I I I II I I I I II I I II M MM II I II II II II III III I II I II II M 

Db 656 AGGACGACGAGCCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGG 715 

Qy 707 CGGAGC CCGCCGCGCCCCCTTCCACGCCGGCCG 739 

I I II I I II II II I II I I I I II II I II II I I 

Db 716 CAGAGCCCGTGTGGANCCCGCCAGCCCCGGCTNCCGCCGCGCCCCCCTCCACCCCGGCCG 775 

Qy 740 CGCCCAAGCGCAGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTG 796 

I II II I I II I I II I I II II I I II II II M I I II I I II I II I II I M II II I II II I I 
Db 776 CGCCCAAGCGCAGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTG 835 

Qy 797 CAT C T GAGC CT GT GAT AC CCTCCTCTG C AGAAAAAAT TAT GGAT T T GAT G GAG CAG CCAG 856 

I I I II I I II II I II I II I II I II II II I II II I II II II II II I M I I II I M I 
Db 836 CATCTGAGCCTGTGATACGCTCCTCTGCAGAAAA TAT GGAC T T GAAG GAG CAGCC AG 892 

Qy 857 GTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCT 916 

I I II I II I II II I I I I I II I || || I M II I II II I II II II II I I II II II || II I I 
Db 893 GTAACACTATTTCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTT 952 



Qy 917 CTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTA 976 

M I I I I I I I I I I I I I I I || | | M | t I I I I II I I I I I I I I I II I I I I I II I II II 
Db 953 CTCTTCCTTCTCTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTA 1012 

Qy 977 AC T TAT C AG C AGT GT CAT C CT C AGAAGGAACAAT T GAAGAAACT TT AAAT GAAGCT T C T A 1036 

I M Ml I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 1013 AT T T GT CAAC AGT AT T AC C C ACT GAAGGAAC AC T T CAAGAAAAT GT CAGT GAAGCTT C T A 1072 

Qy 1037 AAGAGT T GC CAGAGAGG G CAACAAAT C CAT T T GT AAAT AGAGAT T TAG C AGAAT TT T C AG 1096 

I I I M I I I I I I I I I I I I || || | | I I II I I I I II I I I I I I I I I I I I I I 
Db 1073 AAGAG GT CT C AGAGAAGGCAAAAACT C TACT CAT AGAT AGAGAT T TAAC AGAGT T T T C AG 1132 

Qy 1097 AAT T AGAAT AT T CAGAAAT GG GAT CAT C T TT TAAAGG CT C C C CAAAAGGAGAGT C AGC CA 1156 

I I I I M I I I I I II I I I I I I I I I I I I I I II I I III M I I I I I III II Ml 
Db 1133 AAT T AGAAT AC T CAGAAAT G GG AT CAT C GT T CAGT GT CT CT C CAAAAGC AGAAT C T GC C G 1192 

Qy 1157 TAT T AGT AGAAAACACT AAGGAAGAAGTAAT T GT GAGGAGT AAA GACAAAGAGGATT 1213 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I || | | | || | | 
Db 1193 TAAT AGTAGC AAAT C CT AG G GAAGAAAT AAT C GT GAAAAATAAAGAT GAAGAAGAGAAGT 1252 

Qy 1214 T AGT T T GT AGT G C AG C C C T T C AC AGT C C AC AAGAAT C AC C T GTGG 1258 

I M I I I I I I II I I I I I I I I I I II I I I I I I Ml 

Db 1253 T AGT T AGTAAT AACAT C CT T CAT AAT CAACAAGAGT T AC CT ACAG CT CT T ACTAAAT T GG 1312 

Qy 1259 GTAAAGAAGAC AGAGT T GT GT CT C C AGAAAAGACAAT GGACAT T T T TAAT GAAAT G C AGA 1318 

I I I I I I I I I I I I I I I I I I I I I I I I I Ml II I I I I I I I I I I I M I 
Db 1313 TTAAAGAGGAT GAAGTT GT GT CTT CAGAAAAAGCAAAAGACAGTTTTAAT GAAAAGAGAG 1372 

Qy 1319 T GT CAGT AGT AG C AC CT GT GAG GGAAGAGT AT GCAGACT T T AAGC C AT T T GAACAAGC AT 1378 

I MM I Ml III I I I II I I II I I I I I I I I M I II I I I I I II I I II II 
Db 1373 T T G CAGT G GAAGCT C C TAT GAGGGAGGAAT AT GCAGACT T CAAAC CAT T T GAGC GAGT AT 1432 

Qy 1379 GGGAAGTGAAAGATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT 1431 

I I I I I I M I II I I I I I I I I I I I I I I I II I I I II I I I I I I I 

Db 1433 GGGAAGT GAAAGAT A GTAAGGAAGAT AGT GAT AT GTT GGCT GCT GGAGGTAAAAT C G 1489 

Qy 1432 AAT GT GGAAAGT AAAGT G GAC AGAAAAT GCTT G GAAGAT AGC CT G GAGCAAAAAA 1486 

M II I I I I II I I I I I I II I I I I M I II I I I I I I II I I I I I I I I I I 
Db 1490 AGAG CAACTT GGAAAGT AAAGT GG ATAAAAAAT GTT T T GCAGAT AGC C TT GAGCAAAC T A 1549 

Qy 1487 GT CTT GG GAAGGAT AGT GAAGGC AGAAAT GAGGAT GCT T C T TT C C C CAGT AC C CC AGAAC 1546 

M I M I I I I I I I I I M I II I I III I I M I I I I I II I II II I I I I II 
Db 1550 ATCACGAAAAAGAT AGT GAGAGT AGTAAT GAT GATACTTCTTT C CCCAGT ACGCCAGAAG 1609 

Qy 1547 CTGT GAAGGACAGCT CCAGAGCATATATT ACCT GTGCTT C CTT T A CCTCAGCAACCG 1603 

I I M II I III II I I I || I I I I I I II M I II I I I I I I M I I I I I 
Db 1610 GT AT AAAGGAT C GT T CAGGAGC AT AT AT C AC AT GT GCT C C CTT TAAC C C AG C AGC AAC T G 1669 

Qy 1604 AAAG C AC C AC AG CAAAC ACT T T C C C T T T GT T AGAAGAT CAT ACT T C AGAAAAT AAAACAG 1663 

I M I I M MUM II I I I I II II II I I I I I I I I II II II II I II I I II I 
Db 1670 AGAG CAT T GC AACAAAC AT TTTTCCTTTGT TAG GAGAT C C T ACT T C AGAAAAT AAGAC C G 1729 

Qy 1664 ATG-AAAAAAAAATAGAAGAAAGGAAGGCCCAAATTATAACAGAGAAG ACTAGCCCC 1719 

M I I I M I I I I I I II I I I II I I M I II I I II I I I I I I I II I I II II II II II 
Db 1730 AT GAAAAAAAAAAT AGAAGAAAAGAAG G C C CAAAT AGT AAC AGAGAAGAAT ACT AG C AC C 1789 

Qy 1720 AAAAC GT CAAAT CC-TTTCCTT GT AGC AGT AC AG GAT T CT GAGGCAGAT TAT GT T ACAAC 1778 



Db 1790 AAAACATCAAACCCTTTTACTTGTAGCAGCACAGGATTCTGAGACAGATTATGTCACAAC 1849 

Qy 1779 AGAT AC CTTAT CAAAGGT GACT GAGGCAGCAGT GTCAAAC AT GCCT GAAGGT CT GACGCC 1838 

I I I I I M I I I I I I I I I I I I I I I I II III I I I I I I I I I II I I I II Mill II 
Db 1850 AGATAAT TTAACAAAGGT GACT GAGGAAGT C GT GGCAAACAT GCCT GAAGGC CT GACT C C 1909 

Qy 1839 AGAT T T AGT T C AG GAAG CAT GT GAAAGT GAAC T GAAT GAAGC CAC AGGT ACAAAGAT T G C 1898 

I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II II I I I I I I I I || M I I 
Db 1910 AGAT T T AGT AC AGGAAGCAT GT GAAAGT GAAT T GAAT GAAGT T ACT G GT ACAAAGAT T GC 1969 

Qy 1899 TTAT GAAACAAAAGT GGACTT GGT CCAAACAT CAGAAGCTATACAAGAAT CACTT TACC C 1958 

I I I I I I I I I I I M I I I I I I I I I I I I I I II I I I I I II II I II I I I I I I I I II II 
Db 1970 T TAT GAAACAAAAAT G GACT T GGT T CAAAC AT C AGAAGT T AT GCAAGAGT CAC T C TAT C C 2029 

Qy 1959 C ACAG C AC AGC T T T G C C CAT CAT T T GAGGAAG CT GAAGCAACT C C GT CAC C AGT T T T GC C 2018 

I I I I I I I I I I M I I M I II I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I 
Db 2030 T G C AG C AC AGCT T T GC C CAT CAT T T GAAGAGT C AGAAG CT AC T C CTT C AC C AGT T T T G C C 2089 

Qy 2019 TGATATTGTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGT 2078 

III I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I 

Db 2090 TGACATTGTTATGGAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGAT 2149 

Qy 2079 GC AG C C C AGT GT AT C C C CACT GGAAGC AC CT C CT C C AGT T AGT TAT GACAGT ATAAAG C T 2138 

I I I I I I I I III III I I I I I II II I I I I I I I I I I I I II I I I I I I 
Db 2150 ACAG C C C AG C T CAT CAC CAT T AGAAG C T T C T T C AGT T AAT TAT GAAAG C AT AAAAC A 2206 

Qy 2139 T GAG C C T GAAAAC C C C C CAC CAT AT GAAGAAG C CAT GAAT GT AG CAC T AAAAGCTTT 2195 

I I I I I II I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I II I I I I I I I I 
Db 2207 T GAGC CT GAAAAC C C C C CAC CAT AT GAAGAG G C CAT GAGT GT AT C ACTAAAAAAAGT AT C 2266 

Qy 2196 G GGAAC AAAGGAAGGAATAAAAGAGC C T GAAAGT T T TAAT G C AGCT GT T CAGGAAAC AG A 2255 

I I I I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I MM II I I II II 

Db 2267 AG GAAT AAAGGAAGAAAT T AAAGAGC CT GAAAAT AT TAAT GCAG C T CT T CAAGAAACAGA 2326 

Qy 2256 AG CT C CT TAT AT AT C CAT T GCGT GT GAT T TAAT T AAAGAAAC AAAG CT C T C CACT GAGC C 2315 

I II I II II I I I I I II I I I II I I I I I I I II I I II I II II I I I I II II II I I I I II 
Db 2327 AGCT C CT T AT AT AT C TAT T G CAT GT GAT T TAAT T AAAGAAACAAAGCTT T CT GCT GAAC C 2386 

Qy 2316 AAGT C C AGAT T T C T CT AAT TAT T CAGAAAT AGCAAAAT T C GAGAAGT C GGT G C C CGAAC A 2375 

I III II II II I M I I I I I II II II I I I II I I I I II II I I I I I I II II 
Db 2387 AGCT C C G GAT T T CT CT GAT TAT T CAGAAAT G G CAAAAGTT GAAC AG C C AGT GCCT GAT C A 2446 

Qy 2376 C GCT GAGCT AGT GGAGGAT T CCT C AC CT GAAT CT GAAC C AGT T GACT TAT T T AGT GAT GA 2435 

I I I I II I I I I M II I II I I I I I I I II II I II II M II I I I I I I I I I I II I I I I II 
Db 2447 T T CT GAGC T AGT T GAAGAT T CCT CAC C T GAT T C T GAAC C AGT T GACT TAT T T AGT GAT GA 2506 

Qy 2436 T T CGAT T C C T GAAGT C C CACAAACACAAGAGGAG GCT GT GAT GCT CAT GAAG GAGAGT CT 2495 

III II I I II I M I I I I I I I I I I II I II I I I I I II I II I I I I II II I II 
Db 2507 T T CAAT AC C T GAC GT T C CACAAAAACAAGAT GAAAC T GT GAT GCT T GT GAAAGAAAGT CT 2566 

Qy 2496 CACTGA AGT GT CT GAGACAGTAGCC CAGCACAAAGAGGAGAGACTT AGT GCCT C 2549 

I I I I I I II I III I III II I I I I I I II I I I I 

Db 2567 CACT GAG AC T T CAT T T GAGT CAAT GAT AG AAT AT GAAAAT AAG GAAAAACT C AGT G CT T T 2626 

Qy 2550 AC CT C AG GAG CT AGGAAAG C CAT AT T T AGAGT CT T T T C AGC C CAAT T T ACAT AGT AC AAA 2609 

II I Ml I I I II I II I I II I II II I I I I I I III II II I I III II I II 



Db 2627 GC C AC CT GAGG GAGGAAAGC C AT AT TT G GAAT CT T T TAAGCT C AGT T T AGAT AACACAAA 2686 
Qy 2610 AGATGC T GCAT CTAATGACATT CCAACATT GACCAAAAAGGAGAAAATTT CTTT GCA 2666 

MM I I M M M M M M M I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2687 AGAT AC C CT GT T AC C T GAT GAAGT T T CAAC AT T GAGCAAAAAG GAGAAAAT T C C T T T GCA 2746 

Qy 2667 AAT G GAAGAGT T TAAT AC T GCAAT T TAT T C AAAT GAT GACT T AC T T T C T T C T AAGGAAGA 2726 

M M I III I I II II I II I I I I I I I I I I I I I I I I I II I || I I I I II I I I I I 
Db 2747 GAT G GAGGAGC T C AGT ACT G CAGT T TAT T C AAAT GAT GACT TAT T TAT T T C TAAGGAAGC 2806 

Qy 2727 CAAAAT AAAAGAAAGT GAAACAT T T T C AGAT T CAT CT C C GAT T GAGAT AAT AGAT GAATT 2786 

I MM I I I I I I II I I I M I II I II I I I I I I I I I I I I I I || I I | I I I I | || 
Db 2807 ACAGATAAGAGAAACT GAAACGTTTT CAGATT CAT CTCCAATT GAAAT TAT AGAT GAGTT 2866 

Qy 2787 TCCCACGTTTGT CAGT GCTAAAGAT GAT TC T CCTAAAT TAGCCAAGGAGTACACT GA 2843 

MUM I II II M I II II I I I I I I II I || I I I I M I II I I I I I I I 
Db 2867 C C C T AC AT T GAT CAGT T CTAAAACT GATT C AT T T T CT AAAT TAG C C AG GGAAT AT ACT GA 2926 

Qy 2844 T CT AGAAGT AT C C GACAAAAGT GAAAT T GCT AAT AT C C AAAGC GGGGC AGAT T CAT T G C C 2903 

M M II I I I I I I I I I I I I II I I I M I I I I I I I || I I I I I I I I I I I M 

Db 2927 C CT AGAAGT AT C C CACAAAAGT GAAAT T GCT AAT GC C C C GGAT GGAGC T GGGT CAT T GC C 2986 

Qy 2904 T T GCT T AGAAT T GC C C T GT GAC CT T T CT T T CAAGAAT AT AT AT C CT AAAGAT GAAG 2959 

M M I II I II I I I I I I I I I I I I I I I I I M I I I M I M I I I I I II II 
Db 2987 T T GC ACAGAAT T G C C C CAT GAC CTTTCTTT GAAGAAC AT ACAAC C CAAAGT T GAAGAGAA 3046 

Qy 2960 --T AC AT GT T T C AGAT GAAT T CT C C GAAAAT AGGT C CAGT GTAT CTAAGGC AT CC AT AT C 3017 

I I I I I I I I I I I II I I MINIMI I I I I I I | | | Ml 

Db 3047 AAT CAGT T T C T C AGAT GACT T T T C T AAAAAT G G GT CT G CT ACAT CAAAGGT G CT C T TAT T 3106 

Qy 3018 GCCTTCAAATGTCTCTGCTTTGGAACCTCAGACAGAAATGGGCAGCATAGTTAAATCCAA 3077 

M I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 3107 G C CT C CAGAT GTTTCTGCTTT GGC C ACT CAAG CAGAGAT AGAGAGC AT AGT T AAAC C CAA 3166 

Qy 3078 AT C AC TT AC GAAAGAAG CAGAGAAAAAACT T C CT T CT GAC ACAGAGAAAGAGGAC AGAT C 3137 

I Ml I I M I I I I I I I I I I I M II I I I I I I I II I I I I I I I I I I I I | I || I || 
Db 3167 AGT T CT T GT GAAAGAAGCT GAGAAAAAACT T C CT T C C GAT ACAGAAAAAGAGGAC AGAT C 3226 

Qy 3138 C CT GT C AGCT GT AT T GT CAG CAGAG C T GAG - TAAAACT T CAGT T GT T GAC C T C CT CT ACT 3196 

I M III I I I I I I II I I II I M I I I M I II I I I I I I I I I I I I I I I I I I I I I M 
Db 3227 ACCATCTGCTATATTTTCAGCAGAGCTGAGCTAAAACTTCAGTTGTTGACCTCCTGTACT 3286 

Qy 3197 G GAGAGAC AT T AAGAAGACT GGAGT G GT GTT T GGT GC CAG C T TAT TCCTGCTGCT GT C T C 3256 

M II I I I II I II I I I I I I I I I I I I I I I I I I I || | | | || | | | I I | | | | | | | I | | | || 
Db 3287 GGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCAT 3346 

Qy 3257 TGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGA 3316 

I M I M I II II II I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I 
Db 3347 TGACAGTATTCAGCATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGA 3406 

Qy 3317 CT AT C AGCT T TAG GAT AT AT AAGGGC GT GAT C C AGGCT AT C CAGAAAT CAGAT GAAGGC C 3376 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I | I I | || | | | | | | | | | | | | | | || || | 
Db 3407 C CAT CAGCTT TAGGAT AT ACAAGGGT GT GAT CCAAGCT AT C CAGAAAT CAGAT GAAGGC C 3466 

Qy 3377 AC C CAT T C AGG GC AT AT T T AGAAT CT GAAGT T GCT AT AT CAGAGGAATT GGT T CAGAAAT 3436 

I M I I I I I I I I I II I I I I I I I II I I I I I I I I II I I I I I II II I II I I I I I I II I I 
Db 3467 AC C C ATT CAG GGC AT AT CT GGAAT CT GAAGT T G C TAT AT C T GAGGAGT T G GT T CAGAAGT 3526 



Qy 3437 AC AGT AATT C TGCTCTTGGT CAT GT GAAC AGC ACAAT AAAAGAAC T GAGGC GGCTTTTCT 3496 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I Mill I I I I I I 
Db 3527 ACAGTAATTCTGCTCTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCT 3586 

Qy 3497 TAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATG 3556 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I Mill MM 
Db 3587 T AGT T GAT GAT T T AGT T GAT T C T CT GAAGT T T G CAGT GTT GAT GT G GGT AT T T AC CT AT G 3646 

Qy 3557 TTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTA 3616 

I M M II I I I I II II I I I II I II M II II I I I I I I I I I I II M I II I II I II II I 
Db 3647 TT GGT GCCTTGTTTAAT GGT CTGACACTACTGATTTTGGCTCTCATTTCACTCTT CAGT G 3706 

Qy 3617 T T C CT GT TAT T TAT GAAC GGCAT CAGGT GCAGATAGAT CAT TAT CT AG GAC TT GCAAAC A 3676 

I II II I I I II M I I I I I I I I I I I I M I I II M II I II II II I I I I I M M I I I I I I I 

Db 3707 T T C CT GT T AT T TAT GAAC G GC AT CAGGCAC AGAT AGAT CAT TAT C TAG GACTT G CAAAT A 3766 

Qy 3677 AGAGT GT TAAGGAT G C CAT GGC CAAAAT C CAAGCAAAAAT C C CT GGAT T GAAGC GCAAAG 3736 

I II MUM Mill II I II I II I I I I I I I I II I M I I I I I I I I I I I I I II I I I M I 
Db 3767 AGAAT GT T AAAGAT GCT AT GGCT AAAAT C CAAG CAAAAAT C C C T GGAT T GAAGC G CAAAG 3826 
PS Claim 1; SEQ ID NO 124; 339pp + Sequence Listing; English. 
XX 

CC This invention describes a novel disease detection and treatment molecule 

CC polypeptide (MDDT) which has anti-inflammatory, immunosuppressive, 

CC osteopathic, cytostatic, anti-HIV, haemostatic, nephrotropic, 

CC antianaemic, antipsoriatic and hepatotropic activity. The polynucleotides 

CC and the polypeptides of the invention can be used for gene therapy, 

CC protein replacement therapy and are useful for treating a variety of 

CC diseases or conditions. These polypeptides or polynucleotides are 

CC particularly useful for diagnosing, treating or preventing cell 

CC proliferative disorders (e.g. cancers including adenocarcinoma, 

CC leukaemia, lymphoma, melanoma, myeloma or sarcoma), anaemia, Crohn's 

CC disease, acquired immunodeficiency syndrome (AIDS), Goodpasture's 

CC syndromes, inflammation, osteoporosis, thrombocytopaenia, psoriasis or 

CC hepatitis. ABX34440-ABX34835 encode the MDDT polypeptides represented in 

CC ABU11450-ABU11845, described in the disclosure of the invention. NOTE: 

CC The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic format from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 4698 BP; 1410 A; 1028 C; 1022 G; 1238 T; 0 U; 0 Other; 

Query Match 61.4%; Score 2297.4; DB 7; Length 4698; 

Best Local Similarity 80.7%; Pred. No. 0; 

Matches 2996; Conservative 0; Mismatches 596; Indels 121; Gaps 22; 

Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193 

1 I I I I I I I I I I 1 I I I I I I I i I I I I I I I I I I I I I I I I I I I I I | | | I I 
Db 23 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGCTC 82 

Qy 194 CTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCCA 253 

I I I I I I I I I I I I I I II I I I I I I I I I I I | I | | I I I I I I I I I I I I 

Db 83 CTGAGACGCGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCCA 141 



Qy 254 TGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCTC 313 



Db 142 TGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCGC 198 

Qy 314 CGCCCGCCTT CAAGT ACCAGT T C GT G AC G GAGC C C GAGGAC GAG GAGGAC GAGGAGGAG G 373 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M II II I I 
Db 199 AGC C C GC GT T CAAGT AC CAGT T C GT GAG G G AGC C C GAGGAC GAG GAG GAAGAAGAGG 255 

Qy 374 AGGAGGACGAGGAGGAGGAC GAC GAGGAC CTAGAGGAACT GGAGGT GCT GGAGAGGAAGC 433 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 

Db 256 AG GAG GAAGAG GAGGAC GAGGAC GAAGAC CT GGAG GAGC T GGAG GT GCT GGAGAG GAAG C 315 

Qy 434 CCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTGC 487

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 316 CCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTGA 375 

Qy 488 TGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCC 547 

I I I I I I I I I I I I I I I I I I I II I I I I I I || || | | | | | | || | | | | | | M 
Db 376 TGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTTCCC 435

Qy 548 CTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCCG 598 

I I I I I I I I M I M I I I I I I I I I I I I I I I I I | | | Mill 

Db 436 CCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCCG 495

Qy 599 CGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGC 658 

M I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I II I I II I I 
Db 496 CGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAGC 555 

Qy 659 CTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I I I I I I I I I I M Mill II II Ml III II II I I I I III III 
Db 556 CTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTGT 615

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGCA 751 

I M II II II I I I I I I I II I I II I I II II I I I I I M I M 
Db 616 GGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGCA 675 

Qy 752 GGGGCTCC GG CT C AGT GGAT GAGAC C C — TTTTTGCTCTTCCTGCTGCATCTGAGC 805

I I I I I I I I I II I I I I I I II II I I I I I II I I I II Ill 

Db 676 GGGGCTCCTCGGGCTCAGAT GGAT GAGACCCATTTTT GCT CTTACCT GCT GCATCT GAGC 735 

Qy 806 CT GT GAT AC CCTCCTCT GCAGAAAAAAT TAT GGAT T T GAT GGAGC AG C C AGGTAACACT G 865

I M I I I M I I I I II I I I III MIMI I I II I I M II I I I II I II I II I I 

Db 736 CTGTGATACGCTCCTC — AT GCAGAAAATAT GGACTT GAAGGAGCAGCCAGGTAACACT A 793

Qy 866 TTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTT 925

I M I I I I I I I I I I I I || I II II II I I I I I II I I I I I II I I I II I I I II I I I I I I I I M 
Db 794 TTTCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTT 853

Qy 926 CTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAG 985

I I I I I I I I I I I I I I I I I I MIMI I I II I II I II I II I I I || I I II II III 
Db 854 CTCTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAA 913 

Qy 986 CAGT GT CAT C CT C AGAAG GAAC AAT T GAAG- AAACT T T AAAT GAAGC T T CT AAAGAGT T G 1044

I I I I I I M I I I I I I I I M II III I II I I I I II II I II II I II II I I 
Db 914 CAGT ATT ACCC ACT GAAGGAACACTTCAAGAAAAATGTC AGT GAAGCTTCTAAAGAGGTC 973 

Qy 1045 CCAGAGAGGGCAACAAAT CCATTT GTAAAT AGAGATTT AGCAGAATTTT CAGAATT AGAA 1104 

I I I I I I I I I I I II II I I II I I || II II I II I II I I I II II I II II I I II 



Db 974 T C AG AGAAG G C AAAAAC T C T AC T CAT AG AT AG AG AT T T AAC AG AG T T T T C AGAAT T AGAA 1033 

Qy 1105 TAT T CAGAAAT GGGAT CAT CT T T TAAAG G C T C C C CAAAAGGAGAGT C AGC C AT ATT AGT A 1164 

M I M I I I I I I I I I M I I I II I I III I I I I I I I Ml || Ml M Mill 
Db 1034 T AC T CAGAAAT GGGAT CAT C GT T C AGT GT C T C T C C AAAAG CAGAAT C T GC C GT AAT AGT A 1093 

Qy 1165 GAAAAC ACT AAGGAAGAAGT AAT T GT GAG GAGTAAA GACAAAGAGGATTTAGT TT GT 1221 

I I I I I I I I I M II I I I I I I I I I I I I I I II I I II I I II I I I | I I 

Db 1094 GC AAAT C CT AG GGAAGAAAT AAT C GT GAAAAATAAAGAT GAAGAAGAGAAGT TAGT T AGT 1153 

Qy 1222 AGT G C AGCC CT T CAC AGT C C AC AAGAAT C AC C T GT GGGT AAAGAA 1266 

M I I I I I I I I I I I I M I I I II I I I I I I I I I I 

Db 1154 AAT AAC AT C CT T CAT AAT CAACAAGAGT T AC C TACAGCT CT TACT AAAT T G GTTAAAGAG 1213 

Qy 1267 GACAGAGTT GT GT CT CCAGAAAAGACAAT GGACATTTTTAAT GAAAT G CAGAT GT CAGT A 1326 

II I I I I i I II I I I I I I I I I III I I I I I I I I I I I I I I I I I | | | | 

Db 1214 GAT GAAGTT GT GT CTT CAGAAAAAGCAAAAGACAGTTTTAAT GAAAAGAGAGTT GCAGT G 1273 

Qy 1327 GTAGCAC CTGT GAGGGAAGAGT AT GCAGACTT TAAGCCATTT GAACAAGCAT GGGAAGT G 1386 

I M I I M I I I M I I II I I I I I I I I I I I II I I I I I I M | M I I I I I I I I I I 
Db 1274 GAAGCT CCTAT GAGGGAGGAAT AT GCAGACTT CAAACCATTT GAGCGAGT AT GGGAAGT G 1333 

Qy 1387 AAAGATACTT AT GAGGGAAGTAGGGAT GT GCT GGCT GCTAGAGCT AAT 1434

I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I | | 

Db 1334 AAAGATA GTAAGGAAGAT AGT GAT AT GT TGGCTGCTG GAGGTAAAAT C GAGAGCAAC 1390 

Qy 1435 GT GGAAAGTAAAGT G GAC AGAAAAT GCT T GGAAGATAGC C T GGAGCAAAAAAGT CT T G GG 1494

I M I I I II I I I II I I I I I I I I I I || | | | | || | | M I I I I I I I I || I 

Db 1391 TT GGAAAGTAAAGT GGATAAAAAAT GTT TT GCAGATAGCCTT GAGCAAACTAAT CACGAA 1450 

Qy 1495 AAGGAT AGT GAAG GC AGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C C AGAAC CT GT GAAG 1554

I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I 
Db 1451 AAAGAT AGT GAGAGT AGTAAT GAT GAT ACTT C T TT C C C CAGT AC G C C AGAAGGT ATAAAG 1510 

Qy 1555 GACAGCTCCAGAGCATATATTACCTGTGCTTCCTTTA C C T C AG C AAC C GAAAG CAC C 1611 

I I Ml I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I M I 
Db 1511 GAT C GT T C AG GAG CAT AT AT C ACAT GTGCTCCCTT TAACC CAG C AGCAAC T GAGAGCATT 1570 

Qy 1612 AC AGCAAAC AC TTTCCCTTT GT T AGAAGAT CAT ACT T C AGAAAAT AAAACAGAT GAAAAA 1671 

II I I I I I 1 Ml M I I I I I I I I I I I I I II I I I I I I I I I I I I I II I M I I I I I I 

Db 1571 GCAACAAACAT TTTTCCTTTGT TAGGAG AT C C T ACTT C AGAAAAT AAGAC C GAT GAAAAA 1630 

Qy 1672 AAAATAGAAGAAAGGAAGGC CCAAATTATAACAGAGAAG AC TAG C C C C AAAAC GT C A 1728 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I II I I I I I I I I I III 
Db 1631 AAAAT AGAAGAAAAGAAG G C C CAAAT AGT AACAGAGAAGAAT AC T AGCAC CAAAACAT C A 1690 

Qy 1729 AAT CCTTTCCT T GT AG C AGT AC AGG AT T C T GAG G C AGAT TAT GT T ACAACAGAT AC CT T A 1788

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I III 
Db 1691 AAC CCTTTTCTT GT AG CAGC AC AGGAT T CT GAGAC AGAT TAT GT CAC AAC AGAT AAT T T A 1750 

Qy 1789 T CAAAG GT GACT GAG GC AG C AGT GT CAAACAT GC C T GAAG GT CT GAC G C CAGAT T TAGT T 1848

I I I I I I I I I I I I I I I II I M I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1751 ACAAAGGT GAC T GAGGAAGT C GT GG CAAACAT G C CT GAAGGC CT GACT C CAGAT T TAGT A 1810 

Qy 1849 CAGGAAGCAT GTGAAAGT GAACTGAAT GAAGCCACAGGTACAAAGATTGCTTAT GAAACA 1908

I I I I I I I I M | || | | | | | | | | | | | | | | | | | | | | | | | | 

Db 1811 CAG GAAG CAT GT GAAAGT GAAT T GAAT GAAGT T ACT GGT ACAAAGAT T G C T TAT GAAACA 1870



Qy 1909 AAAGT GGACTT GGT C CAAACAT CAGAAGCTAT ACAAGAAT CACTTT AC CC CACAGCAC AG 1968 

IN I II I I I I I I I I I I I I Ml Mil I M II I I I I | M I 

Db 1871 AAAAT G GAC TT G GT T CAAACAT CAGAAGT TAT GCAAGAGT CAC T CT AT C C T G C AG CAC AG 1930 

Qy 1969 CT T T G C C CAT CAT T T GAG GAAGCT GAAGCAACT C C GT CAC C AGTT T T G C C T GAT ATT GT T 2028 

I I M I M I I I I I M I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I M I I 
Db 1931 CTTTGCCCATCATTTGAAGAGTCAGAAGCTACTCCTTCACCAGTTTTGCCTGACATTGTT 1990 

Qy 2029 ATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGT 2088

I I I M I I I I I I I M I I I I I I I I I I I I I I I I | || | | | | | | | | | || | | || 

Db 1991 AT GGAAG CAC CAT T GAAT TC T G CAGT T CC T AGT GCTGGTGCTTCC GT GAT AC AGC C C AGC 2050 

Qy 2089 GT AT C C C CAC T G GAAG CAC C T C CT C CAGT T AGT TAT GAC AGT AT AAAGCT T GAGC C T GAA 2148

Ml Ml I II I I II II 1111(1 I I I I I I II I I I I I I I I I II I II I I 
Db 2051 T CAT CAC CAT T AGAAG CT T CT T CAGT T AAT TAT GAAAGC ATAAAAC AT GAGCC T GAA 2107 

Qy 2149 AAC C C C C CAC CAT AT GAAGAAG C CAT GAAT GT AGCAC T AAAAGCTTTGGGAACAAAG 2205

I i I I I I I M || I || I | | | | | | | | | 

Db 2108 AACC C C C CAC CAT AT GAAGAGGC CAT GAGT GT AT CACT AAAAAAAGT AT CAG GAATAAAG 2167

Qy 2206 GAAGGAAT AAAAGAG C C T GAAAGT T T T AAT GC AG CT GT T C AGGAAACAGAAG CT C CT T AT 2265 

I I I I Ml I I I I I M I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I 

Db 2168 GAAGAAAT TAAAGAGC CT GAAAAT AT T AAT GC AG CT CT T CAAGAAACAGAAG CT C C TT AT 2227 

Qy 2266 AT AT C CAT T G C GT GT GAT T T AAT T AAAGAAACAAAGCT CT C CACT GAGC CAAGT C CAGAT 2325 

MMI Mill I I I I I II I I I II I I I I I I | || | | | | | M | | | | | M | | | Ml 
Db 2228 AT AT CTAT T G CAT GT GAT TT AAT TAAAGAAACAAAG CTTTCTGCT GAAC C AGC T C C GGAT 2287

Qy 2326 TT CT CTAAT TAT T C AGAAAT AGCAAAAT T C GAGAAGT C GGT G C C C GAAC AC GCT GAGCT A 2385

M M II I I I I I I I I I I I I I I I I | || | | | | | | | | | || | | | | | | | | | || | 
Db 2288 T T C T CT GAT TAT T CAGAAAT GGCAAAAGT T GAAC AGC CAGT GC C T GAT CAT T CT GAGC T A 2347

Qy 2386 GT GGAGGAT T C C T CAC C T GAAT CT GAAC C AGTT G ACT TAT TT AGT GAT GAT T C GAT T C CT 2445

M M I I I I I I II I I I I I I I || I | | | | M M I I I I I I I I I I I I I I I I I I I I || I | | 
Db 2348 GTT GAAGATT C CT CACCT GAT T CT GAAC CAGTT GACTTATTTAGT GAT GATT CAATAC CT 2407

Qy 2446 GAAGT C C C ACAAAC ACAAGAG GAG GCT GT GAT G CT CAT GAAGGAGAGT CT C ACT GA 2501

M II II II II I I || I I I I I I I I I I I M M I I I I M II I I I I I I I I I 
Db 2408 GAC GT T C CACAAAAACAAGAT GAAAC T GT GATGC T T GT GAAAGAAAGT C T CACT GAGACT 2467 

Qy 2502 AGT GTCT GAGACAGT AGCC CAGCACAAAGAGGAGAGACTTAGT GC CT CACCT CAGGAG 2559 

M I Ml I III I I I I I I I I I I I I I I Ml III 

Db 2468 T CAT T T GAGT CAAT GAT AGAAT AT GAAAAT AAG GAAAAACT CAGT G C T T T G C C AC CT GAG 2527

Qy 2560 CT AG GAAAGC C AT ATTT AGAGT C T TTT C AG C C CAAT T T AC AT AGT ACAAAAGAT G C T 2616 

M M M I I M I I II I II II II I I III II I I I I III II I II II I I I 
Db 2528 GGAGGAAAG C C AT AT TT G GAAT CT TT T AAGCT C AGT TT AGATAACACAAAAGAT AC C CT G 2587 

Qy 2617 G CAT CTAAT GAC AT T C CAAC AT T GAC CAAAAAG GAGAAAAT T T CT T T GCAAAT GGAAGAG 2676

I I Mill II I I I II I II II I I I I || M I II I 

Db 2588 TT AC C T GAT GAAGT T T CAACAT T GAG C AAAAAGGAGAAAAT T CC T T T GC AGAT G G AGGAG 2647

Qy 2677 T T T AAT ACT GCAAT T TAT T CAAAT GAT GACT T ACT T T C T T C T AAG GAAGACAAAAT AAAA 2736

I I I I I II I I II II II II II II I II I I I I I || || | || I 

Db 2648 CT CAGT AC T G CAGT T TAT T CAAAT GAT GAC T T ATT TAT T T CTAAGGAAGC AC AGAT AAGA 2707



Qy 2737 GAAAGT GAAACAT T T T CAGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT T T C C C AC GTT T 2796

I I I I I I I I I I I I II I I I I I I I I II I I I I I II I II I I I I I I M || || || | | 
Db 2708 GAAACT GAAACGTTTT CAGATT CAT CT C CAATT GAAAT TAT AGAT GAGTT CC CTACATT G 2767

Qy 2797 GT CAGT GCTAAAGAT GATT C T C CT AAAT TAG C CAAG GAGT AC ACT GAT CT AGAAGT A 2853

I M I I him I I I I I I I I I I II I II I I I I III II Mill II I I I II I I 
Db 2768 AT CAGT T CTAAAAC T GAT T CAT T T T CT AAAT TAG C C AG GGAAT AT ACT GAC CT AGAAGT A 2827

Qy 2854 T C C GACAAAAGT GAAAT T G C T AAT AT C CAAAGC GG G GC AGAT T CAT T G C C TT GCT T AGAA 2913
I II N I I I I I I I I I M I I M I I I M Mill I I M II II I I M I II I 

Db 2828 T C C CACAAAAGT GAAAT T G C TAAT GC C C C GGAT GGAGC T GG GT CAT T G C CT T GCAC AGAA 2887

Qy 2914 TTGCCCTGT GAC CTT T C T T T CAAGAAT AT AT AT C CTAAAGAT GAAG T AC AT GTT 2967

I I I I I I I I I I M I I M I I I M I I II I I I I II || I || || | || 

Db 2888 T T G C C C CAT GAC CTTTCTTT GAAG AAC AT AC AAC C C AAAG T T G AAG AG AAAAT CAGT T T C 2947

Qy 2968 T CAGAT GAAT T CT C C GAAAAT AGGT C C AGT GTAT CTAAG G CAT C CAT AT C GC CT T CAAAT 3027

I I I I I N I N M M I I I M I I I II I II II | I I I I I I 

Db 2948 T CAGAT GACTTTTCTAAAAATGGGTCTGCTACATCAAAGGT GCT CTT ATT GCCTCCAGAT 3007

Qy 3028 GT CT CT GCTTT GGAAC CTCAGACAGAAAT GGGCAGCATAGTTAAAT C CAAAT CACTTACG 3087

'I I I I I M I I M II I I I II I II I I I I II I I I II I I M M I III I 
Db 3008 GT TTCTGCTTTGGC C ACT CAAGC AGAGAT AGAGAGCAT AGT TAAAC C CAAAGT T CTT GT G 3067

Qy 3088 AAAGAAGCAGAGAAAAAACT T C CT T CT GAC AC AGAGAAAGAGGAC AGAT C C CT GT CAG C T 3147

I I I I I I I I I I M II II II I II I I I I II I I II I I M I I II I II II I I I II Ml 
Db 3068 AAAGAAGCT GAGAAAAAACT T C CT T C C GAT AC AGAAAAAGAGGAC AGAT C AC CAT CT GCT 3127

Qy 3148 GTAT T GT CAGCAGAG C T GAGTAAAACT T CAGT T GT TGAC CT C C T C T AC T GGAGAGAC ATT 3207

I M I II I I II I II II II I || M M I I II I II I II II M I II I I II II II II I II M I 
Db 3128 AT AT T T T CAGCAGAGCT GAGTAAAACT T CAGT T GT T GAC CT CC T GT AC T GGAGAGAC ATT 3187

Qy 3208 AAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTC 3267

N I I I I I I I I I I I I I I I I I I I I I II II I I I I II II I I I II II I M I II II I I III 
Db 3188 AAGAAGAC T GGAGT GGT GTT T G GT G C C AGC CT AT TCCTGCTGCTTT CAT T GAC AGTATT C 3247

Qy 3268 AGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTT 3327

M I I I I I I II Mill I I I II II I II II II II II I I I I I || | Mill II I II I II I 
Db 3248 AGCATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTT 3307

Qy 3328 AG GAT AT AT AAGGGC GT GAT C CAG GC TAT C C AGAAAT CAGAT GAAGG C C AC C CAT T C AGG 3387

I I I I I I I I I I I I I M II I II I II M II II I I II I I I I II I I II || II II II II II II 
Db 3308 AGGAT ATACAAGGGT GT GAT CCAAGCT AT C CAGAAAT CAGAT GAAGGCCAC CCATT CAGG 3367

Qy 3388 G CAT AT T T AGAAT C T GAAGT T GCT AT AT C AGAGGAATT GGT T CAGAAAT AC AG TAATT CT 3447

I I I I I I I I M M I M M I M I I II I II Mill I I I I II I I II I I I II I II I II M 
Db 3368 G CAT AT CT G GAAT CT GAAGT T GCT AT AT C T GAGGAGT T G GT T CAGAAGT AC AGT AAT T CT 3427

Qy 3448 GCTCTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCTTAGTTGATGAT 3507

I I I I I I I I I I I I I I I I I I Ml I M II I I II II 

Db 3428 GCTCTTGGT CAT GT GAAC T G C AC GAT AAAGGAAC T C AG GC GC CT C T T C T T AGT T GAT GAT 3487

Qy 3508 TTAGTT GATT CCCTGAAGTTTGCAGT GTT GAT GTGGGTGTTTACTTAT GTT GGT GCCTTG 3567

I I I I I I I I I I I I I I I M I I I I I I M II I II I I II I I || I M II I II I I I II 

Db 3488 TTAGTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTG 3547

Qy 3568 TTCAAT GGT CTGACACTACTGATTTTAGCTCT GAT CTCACT CTT CAGT ATT CCTGTTATT 3627

II 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MMI 1 1 1 1 II 1 1 1 1 M 1 II 1 1 1 1 | | 1 1 I
Db 3548 TTTAATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATT 3607

Qy 3628 TAT GAAC G G CAT C AG GT G CAGAT AGAT CAT TAT CTAGGACT T GCAAACAAGAGT GT T AAG 3687

1 1 1 1 1 1 1 1 1 1 M 1 M I || | | | If | If II II M 1 1 1 1 1 1 1 1 1 1 1 1 | 1 | | | 1 1 | | | |
Db 3608 TAT GAAC G G CAT C AG G C ACAGAT AGAT CAT TAT C TAGGACT T G CAAATAAGAAT GT T AAA 3667

Qy 3688 GATGCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740

1 1 1 1 II 1 1 1 M 1 1 1 II 1 1 1 1 1 I I I I M II

Db 3668 GAT GCT AT GGC T AAAAT C CAAG CAAAAAT C C CT G GAT T GAAGC GCAAAG CT G A 3720

RESULT 9
AAZ56886

ID AAZ56886 standard; DNA; 3579 BP.
XX

AC AAZ56886;
XX

DT 25-APR-2000 (first entry)
XX

DE Human MAGI polypeptide encoding DNA
XX
KW MAGI protein; neuroendocrine-specific protein; neuropathy; human;
KW spinal injury; neuronal degeneration; neuromuscular disorder; cancer;
KW psychiatric disorder; developmental disorder; inflammatory disorders;
KW stroke; cytostatic; cerebroprotective; neuroprotective; ds.
XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 
FT CDS 1..3579

FT /*tag= a 

FT /product= "MAGI polypeptide" 

XX 

PN WO200005364-A1. 
XX 

PD 03-FEB-2000. 
XX 

PF 21-JUL-1999; 99WO-GB002360.
XX 

PR 22-JUL-1998; 98GB-00016024.
PR 19-JUL-1999; 99GB-00016898.
XX 

PA (SMIK ) SMITHKLINE BEECHAM PLC. 
XX 

PI Michalovich D, Prinjha RK; 
XX 

DR WPI; 2000-182693/16. 
DR P-PSDB; AAY56967. 
XX 

PT Novel polypeptides related to neuroendocrine-specific proteins and
PT polynucleotides useful for diagnosis of various diseases and for 
PT treatment of cancer and neurological disorders. 
XX 

PS Claim 5; Page 19-20; 35pp; English. 
XX 

CC The invention relates to human MAGI protein, which is similar to 



CC neuroendocrine-specific protein. The MAGI protein can be expressed by 

CC standard recombinant methodology. The MAGI polypeptides, polynucleotides 

CC and antibodies are useful for treating diseases, including neuropathies, 

CC spinal injury, neuronal degeneration, neuromuscular disorders, 

CC psychiatric disorders and developmental disorders, cancer, stroke and 

CC inflammatory disorders. The polynucleotide is also useful for chromosome

CC localization and for tissue expression studies. The present sequence 

CC represents a DNA encoding the human MAGI protein.

XX

SQ Sequence 3579 BP; 1074 A; 803 C; 812 G; 890 T; 0 U; 0 Other;



Query Match 61.2%; Score 2289.2; DB 3; Length 3579; 

Best Local Similarity 81.5%; Pred. No. 0; 

Matches 2925; Conservative 0; Mismatches 548; Indels 117; Gaps 19 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312

I I I M I I I I I I I I I I I I I I I I | | | | | | || | | | | | | | M || | | | 

Db 1 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 57 

Qy 313 CCGCCCGCCTT CAAGT AC C AGT T C GT GAC GGAG C CC GAGGAC GAGGAGGAC GAGGAGGAG 372

I Ml Mill I I I I I I Ml | | | || M | | | | M II III 

Db 58 C AGC C C G C GT T CAAGT AC C AGTT C GT GAG GGAGC C C GAGGAC GAGGAG GAAGAAGAG 114 

Qy 373 GAGGAGGAC GAGGAGGAGGACGACGAGGACCTAGAGGAACT GGAGGTGCT GGAGAGGAAG 432 

N MINI II Mill Mill Mill I J J I f I J J I I t I f I I t I t I I I 

Db 115 GAGGAGGAAGAGGAGGAC GAGGAC GAAGACCT GGAGGAGCT GGAGGTGCT GGAGAGGAAG 174
Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

M I I I II MUM I I II I I || || M II I II II II I II 

Db 175 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 234 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546

I I I M II I I I I || | | | | || || I II II I I I I II II M M I I II 

Db 235 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 294 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

M I I II II I II I I I II I I I I I I I || I II I I I I I I I I II 

Db 295 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 354

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I I I I I I I I II I I I I I || I I || I M I II I I II I I I II I I I I II I I I I I II I I 
Db 355 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 414 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I I II I I I II II II || I | | || || Ml III II I II II I Ml III 
Db 415 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 474 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I I I I I I I I I I I I I I I II I II I II M M I I I II I I M 
Db 475 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 534 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I I I I I I I I I M I I II I I I II II I I M I I I I I II I I II II II II I II I I II 

Db 535 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 594 



Qy 808 GT GAT AC CCTCCTCT GC AGAAAAAAT TAT G G AT TT GAT GGAGC AGC C AG GT AACAC T GT T 867
I I I I I I I II I M M M M I M I II II M II I I I I I II I I II I II II I I M I II 



Db 595 GTGATACGCTCCTCTGCAGAAAA TATGGACTT GAAGGAGCAGC CAGGTAACACTATT 651 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927

HI I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I | M | | | | || | | | | | | | M 
Db 652 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 711 

Qy 928 C T AT CT C CT C T CT CAACT GT T T CT T T TAAAGAACAT G GAT AC CT T G GT AACT T AT C AG C A 987

H M I II I I I I M I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I M III II 
Db 712 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 771 

Qy 988 GT GTCAT C CT CAGAAGGAACAATT GAAGAAACTT TAAAT GAAGCT TCTAAAGAGTTGCCA 1047

II I I II I I I I I I I I II II I I I I I I I I I I I I I I I I I I I II I I II I II 
Db 772 GT AT T AC C C ACT GAAG GAACACT T CAAGAAAAT GT CAGT GAAGC T T C T AAAGAGGTCT C A 831 

Qy 1048 GAGAGGGCAACAAAT C CATTT GTAAAT AGAGATTTAGCAGAATTTTCAGAATTAGAAT AT 1107

I I I I I I I I I M M I I II I I I I I I I I I I I I I I I I I I I I I I I I | | | | | || | 
Db 832 GAGAAG GC AAAAAC T CT AC T C AT AGAT AGAGAT T T AACAGAGT T T T C AGAAT TAGAAT AC 891

Qy 1108 T CAGAAAT GGGAT CAT CTTTTAAAGGCT C C CCAAAAGGAGAGT CAGCCATATTAGTAGAA 1167 

I I I I I I I I I I I I I I M I II I I Ml I I I I I || | | | || Ml || | | I | | | | 

Db 892 T CAGAAAT G GGAT CAT C GT T CAGT GT C T C T CCAAAAG C AGAAT CT GC C GTAATAGTAGC A 951

Qy 1168 AAC AC T AAGGAAGAAGT AAT T GT GAGGAGTAAA G AC AAAGAG GAT T T AGT T T G T AG T 1224 

II I M I M I I I I I I I I II I I I | | | | || | | | | | | || M I I I II I 

Db 952 AAT C CTAGGGAAGAAATAAT CGT GAAAAATAAAGAT GAAGAAGAGAAGTTAGTTAGTAAT 1011 

Qy 1225 GC AG C CCT T C AC AGT CC ACAAGAAT C AC CT GT GGGTAAAGAAGAC 1269

M I I I I I I I I I II I I I I I M I I I I M | | | M 

Db 1012 AAC AT CC T T CAT AAT CAACAAGAGT T AC CT AC AG CT C TT ACTAAAT T GGT T AAAGAGGAT 1071 

Qy 1270 AGAGT T GT GT C T C CAGAAAAGACAAT G GAC AT T T T TAAT GAAAT G CAGAT GT C AGTAGT A 1329

I M I I I I I II I I I I I II III I I I I | | | || | | | | || | | | | | | | | 

Db 1072 GAAGT T GT GT CT T CAGAAAAAGCAAAAGAC AGTT T TAAT GAAAAGAGAGT T GC AGT GGAA 1131 

Qy 1330 GCAC CT GT GAGGGAAGAGTAT GCAGACTTTAAGCCATTT GAACAAGCAT GGGAAGT GAAA 1389 

II I I I I I I I I I I M I I I I I I I I I I I II I I I I I I I I I || I I I I I | | | | | | | | 
Db 1132 GCT C CT AT GAGGGAGGAAT AT GCAGACTT C AAAC CAT T T GAGC GAGT AT G GGAAGT GAAA 1191 

Qy 1390 GAT ACT TAT GAGG GAAGTAGGGAT GT GCT GGC T G C TAGAG CT AATGTG 1437

I I I I I I I I I I I I I II I I I II I I I I I II I I || || 

Db 1192 GATA GTAAGGAAGATAGT GAT AT GTT GGCTGCT GGAGGTAAAAT CGAGAGCAACTT G 1248 

Qy 1438 GAAAGTAAAGT GGACAGAAAATGCTTGGAAGATAGCCT GGAGCAAAAAAGT CT T GGGAAG 1497

I I I M I I I I I I I I I I I I I I I | | | | | | | | | | | | | | | || | | | Ml | || 
Db 1249 GAAAGTAAAGT GGATAAAAAAT GT T T T GC AGAT AG C CT T GAGCAAACT AAT C AC GAAAAA 1308

Qy 1498 GAT AGT GAAG G CAGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C CAGAAC CT GT GAAG GAC 1557

I I I I I I I I I I I I I I I I I I I I II II I I I I I II II II I I I I I I I I I I II I 
Db 1309 GAT AGT GAGAGT AGT AAT GAT GAT ACT T CT T T C C C CAGTAC G C C AGAAGGT AT AAAG GAT 1368 

Qy 1558 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTACCT CAGCAACCGAAAGCACCACA 1614 

I I I M I I I I I I I I I I I I I I I I I I II I I I M | | | | | | | | | | || 

Db 1369 C GT C C AGGAGC AT AT AT CAC AT GTGCTCCCTT TAAC C C AGCAG CAACT GAGAG CAT T GC A 1428 

Qy 1615 G C AAAC AC TTTCCCTTTGT T AGAAG AT CAT AC T T C AG AAAAT AAAAC AGAT GAAAAAAAA 1674

Ml Ill II 'Mi; || M | | M 

Db 1429 ACAAAC AT TTTTCCTTTGT TAG GAGAT C C T AC T T CAGAAAAT AAGAC C GAT GAAAAAAAA 1488



Qy 1675 AT AGAAGAAAGG AAG G C C C AAAT T AT AAC AGAGAAG AC TAG C C C C AAAAC GT C AAAT 1731

1 1 1 1 1 1 1 1 1 1 M II II Mill II II II MM II MM

Db 1489 AT AGAAGAAAAGAAGGC CCAAATAGTAACAGAGAAGAATACT AGCAC CAAAACAT CAAAC 1548

Qy 1732 CCTTTCCTT GT AGCAGT ACAG GATT C T GAGGCAGAT TAT GT T ACAAC AGAT AC CT T AT C A 1791

1 1 1 1 1 M 1 II M II 1 1 M 1 1 1 M M 1 1 1 1 1 II 1 III 1 1 Ml II

Db 1549 CCTTTTCTT GT AG CAG C AC AGGATT CT GAGAC AGAT TAT GT C ACAAC AGAT AAT T T AAC A 1608

Qy 1792 AAG GT GAC T GAG G CAGC AGT GT CAAAC AT GC CT GAAGGT CT GAC G C C AGAT T T AGTT C AG 1851

1 1 II II 1 1 M 1 II II III 1 1 1 1 I I I I || II II 1 1 1 1 1 II II 1 1 II M 1 II Ml

Db 1609 AAGGTGACTGAGGAAGTCGTGGCAAACATGCCTGAAGGCCTGACTCCAGATTTAGTACAG 1668

Qy 1852 GAAGC AT GT GAAAGT GAAC T GAAT GAAGC C ACAGGTACAAAGAT T GCT T AT GAAACAAAA 1911

M II II II M 1 1 1 II 1 1 1 1 1 M 1 M 1 1 M 1 M 1 1 1 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 | | | |

Db 1669 GAAGC AT GT GAAAGT GAAT T GAAT GAAGT T ACT GGT ACAAAGAT T GCT TAT GAAACAAAA 1728

Qy 1912 GT G GACT T GGT C CAAAC AT C AGAAGC T AT ACAAGAAT CACTT TAC CC C AC AG C ACAGC T T 1971

M M M M M II Mill Ml Mill 1 II 1 1 II 1 II 1 II 1

Db 1729 AT G GACT T GGT T CAAAC AT C AGAAGT TAT GCAAGAGT CAC T CTAT CCT GC AG CACAGCT T 1788

Qy 1972 T GC C CAT CAT T T GAGGAAGCT GAAGCAACT CC GT CAC CAGT T T T G C CT GAT AT T GTT AT G 2031

N 1 II 1 1 1 II M 1 1 II 1 1 II II MMI 1 1 II 1 1 1 II M II 1 1 1 1 II 1

Db 1789 T GC C CAT C AT TT GAAGAGT CAGAAGCT ACT CCT T CAC CAGT TT T GC CT GAC AT T GTT AT G 1848

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091

1 1 II 1 II II 1 1 II II II II 1 1 1 1 1 II 1 1 1 II II 1 II 1 M 1 II 1 1 1 I

Db 1849 GAAGCACCATTGAATTCTGCAGTTCCTAGT GCT GGT GCTTCCGT GAT ACAGC CCAGCTCA 1908

Qy 2092 T C C C CAC T G GAAG C ACCT C CT C C AGT T AGT TAT GACAGT AT AAAG CT T GAG CCT GAAAAC 2151

II IN 1 II II II II 1 M II 1 llllll II Mill 1 1 II II 1 II 1 II 1 1

Db 1909 T CAC CAT T AGAAG CT T C T T C AGT TAAT TAT GAAAGC AT AAAAC AT GAG C CT GAAAAC 1965

Qy 2152 C C C C CAC CAT AT GAAGAAG C CAT GAAT GT AG CAC T AAAAGCTTT GGGAACAAAGGAA 2208

I 1 1 1 1 1 II 1 1 1 II M 1 1 M 1 M 1 1 1 1 M II II If 1 1 1 1 1 1 II II 1 II II

Db 1966 CC C C CAC CAT AT GAAGAGGC CAT GAGT GTAT CACTAAAAAAAGTAT CAGGAATAAAGGAA 2025

Qy 2209 GGAATAAAAGAGC CT GAAAGT TT TAAT GCAGCT GT T C AGGAAAC AGAAGCT C CT T ATAT A 2268

1 INI II 1 1 M 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 II II 1 1 1 1 1 1 II 1 1 1 1 1 II 1

Db 2026 GAAAT T AAAGAG CCT GAAAAT ATTAAT G CAGCT CT T C AAGAAACAGAAGCT C CT T AT AT A 2085

Qy 2269 T C CAT T GC GT GT GAT T TAAT TAAAGAAACAAAGC T CT C C ACT GAG C CAAGT C CAGATT T C 2328

II 1 1 1 M 1 1 M 1 M 1 1 1 1 1 II II II 1 II II 1 II II MM III Ml 1 M 1 1 1

Db 2086 T C TAT T GC AT GT GAT T TAAT T AAAGAAACAAAGCT T T C T GCT GAAC C AG CT C C GGAT TT C 2145

Qy 2329 T CT AAT TAT T CAGAAAT AG CAAAAT T C GAGAAGT C GGT GC C C GAAC AC GCT GAG CTAGT G 2388

1 M 1 II 1 II 1 II 1 1 1 1 1 II 1 M 1 1 1 1 1 1 1 II II 1 1 1 1 II 1 II 1 1 1 1 1

Db 2146 T CT GAT TAT T CAGAAAT GGCAAAAGT T GAACAGC CAGT GC CT GAT CAT T CT GAG CT AGTT 2205

Qy 2389 GAGGATTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTCCTGAA 2448

M II 1 II 1 1 1 1 1 II II 1 1 1 1 II 1 II II 1 1 II II 1 1 1 II II II | | || | | || || | ||

Db 2206 GAAGAT T C CT C AC CT GAT T CT GAAC CAGT T GACT TAT T T AGT GAT GAT T CAATAC C T GAC 2265

Qy 2449 GT C C C ACAAACACAAGAGGAGG CT GT GAT GCT CAT GAAGGAG AGT CT C ACT GA A 2502

M II 1 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 II 1 II 1 1 1 II II M II 1

Db 2266 GT T C C ACAAAAACAAGAT GAAAC T GT GAT GCT T GT GAAAGAAAGT C T C ACT GAGAC T T C A 2325


Qy 2503 GT GT CTGAGACAGTAGCC CAGCACAAAGAGGAGAGACTTAGTGC CT CACCT CAGGAGCTA 2562

I I HI I IN M I I I I I I I I I II I Ml III I 
Db 2326 T T T GAGT CAAT GAT AGAAT AT GAAAAT AAG GAAAAAC T C AGT GCT T T G C C AC CT GAG GGA 2385

Qy 2563 GGAAAGC CAT AT T T AGAGT CT T T T C AGC C CAAT T T AC AT AGT AC AAAAGAT GC T GC A 2619

I I I I I M I.I ! I I I I | | | | I | 

Db 2386 G GAAAG C CAT AT T T GGAAT CT T T T AAG C T C AGT T TAGATAAC AC AAAAGAT AC C C T GTT A 2445

Qy 2620 T CTAAT GAC AT T C CAACAT T GAC CAAAAAGGAGAAAAT T T C TTT GCAAAT GGAAGAGT T T 2679

II I I I I II Mill || | | || || | | | || | || | | | | | Ml I 

Db 2446 C C T GAT GAAGT T T CAACAT T GAG CAAAAAG GAGAAAAT T C C TTT GCAGAT GGAGGAGC T C 2505

Qy 2680 AAT AC T G C AATT T AT T C AAAT GAT GAC T T AC T T T CT T CT AAGGAAGAC AAAAT AAAAGAA 2739

I I I I I I I I I I I I I I I M II I I I II MINI I I II I || M 

Db 2506 AGT ACT GCAGTTTATT CAAAT GAT GACTTAT TTATTT CTAAGGAAGCACAGATAAGAGAA 2565

Qy 2740 AG T G AAAC AT TTT C AG AT T CAT C T C C GAT T GAG AT AAT AG AT G AAT T T C C C AC G T T T GT C 2799

I I I I I I I I I I M I I I I I I II I I II I I I I I || | | | || || | M II II II II 
Db 2566 ACT GAAAC GTTTT CAGATTCAT CT CCAAT TGAAATTATAGAT GAGTT CCCT ACATT GAT C 2625

Qy 2800 AGTGCTAAAGATGATTC— TCCTAAATTAGCCAAGGAGTACACTGATCTAGAAGTATCC 2856

H I I I I I I I I I I I I II I I II I III II I I I I I I I I I I II II I I I 

Db 2626 AGT T CT AAAACT GATT CATT T T C TAAAT TAG C C AGGGAAT ATAC T GAC CT AGAAGT AT C C 2685

Qy 2857 GACAAAAGT GAAAT T G CTAAT AT C CAAAGC G GG GCAGAT T CAT TGCCTTGCT T AGAAT T G 2916

IN M Ml M I I I I I I I I I || | | | M I II 

Db 2686 CACAAAAGT GAAAT T G CTAAT G C C C C GGAT GGAGCT GGGT C AT T GC C T T GC AC AGAAT T G 2745

Qy 2917 C CCT GT GAC CTTT CTTT CAAGAATAT ATAT C CTAAAGAT GAAG T AC AT GT T T C A 2970

IN I I I I M I I I I I I I I I I II I I I I I I I I I | I I | | | 

Db 2746 C C C CAT GAC CTTTCTTT GAAGAAC AT ACAAC C CAAAGT T GAAGAGAAAAT C AGT TT CT C A 2805

Qy 2971 GAT GAAT T CT C CGAAAAT AG GT C CAGT GT AT CTAAGGC AT C CAT AT C GC CT T CAAAT GT C 3030

I I I I I I I M I I I I I II I I I I I I I I I I I | | | 

Db 2806 GAT GACT TTT CT AAAAAT G G GT C T GC T AC AT CAAAGGT GCT CT T AT T GC C T C C AGAT GT T 2865

Qy 3031 TCTGCTTTGGAACCTCAGACAGAAATGGGCAGCATAGTTAAATCCAAATCACTTACGAAA 3090

I I I I I I I I I I I I I I M M I I I I I I I I I I I I I I I II I I I II I I I I I 
Db 2866 TCTGCTTT GGC CACT CAAG CAGAGAT AGAGAGC ATAGT TAAAC C CAAAGTT CT T GT GAAA 2925

Qy 3091 GAAG CAGAGAAAAAAC T T C CT T CT GAC AC AGAGAAAGAGGACAGAT C C C T GT C AG CT GT A 3150

I I I I I I I I M I I I I I I I I M I I M I I I I I II I I I I I I II II I I I || M I II 
Db 2926 GAAG CT GAGAAAAAAC TTCCTTCC GAT AC AGAAAAAGAGGACAGAT C AC CAT CT G CT AT A 2985

Qy 3151 T T GT CAGC AGAG C T GAGTAAAACT T C AGT T GTT GAC CT C C T CT ACT G GAG AGAC AT T AAG 3210

N I I I I M I I I I I I I M I I I II I I I I I II I I I II I I I I I I I f I I I I I I I I I I M I I I I 
Db 2986 T T T T CAGCAGAG C T GAGT AAAAC T T C AGT T GTT GAC CT C CT GT ACT GGAGAGAC AT TAAG 3045

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270

I N I I I I I I I I I I M I II I I I I I I I I I I I I I I I II I I I I I || I I I I I I I M I I I I 
Db 3046 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3105

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330

I I I I I M I I I I I I I I I I I I M I I M I I I I I I I I I I I M I I I I I I I I I I II I I I I I 
Db 3106 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3165

Qy 3331 ATAT ATAAGGGC GT GAT C CAGGCTAT CC AGAAATCAGAT GAAGGC CAC CCATT CAGGGCA 3390

1 1 1 1 1 Mill 1 1 II M 1 1 1 II 1 1 1 1 1 1 1 M II 1 II 1 1 II 1 1 1 II 1 II 1 II M 1 II II

Db 3166 AT AT ACAAGG GT GT GAT C CAAG CT AT C C AGAAAT CAGAT GAAGGC C AC C CAT T C AG GGC A 3225

Qy 3391 TAT T T AGAAT C T GAAGTT GCT AT AT CAGAG GAAT T GGT T C AGAAAT AC AGTAAT T CT GCT 3450

Ml 1 1 1 1 1 1 II 1 1 II 1 1 II II I I | I | | I I II II | M | | || 1 1 1 1 1 I I I | | | | 1 II

Db 3226 TAT C T G GAAT C T GAAGTT GC TAT AT CT GAG GAGT T GGT T CAGAAGT AC AGTAAT T CT GCT 3285

Qy 3451 CT T GGT CAT GT GAAC AGC ACAAT AAAAGAAC T GAGG C G GC T T T T CT T AGT TGAT GAT T T A 3510

M 1 1 II 1 1 II 1 1 II 1 MM II 1 M 1 1 II 1 II 1 1 1 II II 1 1 1 II 1 || | | M M II

Db 3286 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3345

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570

M 1 1 II 1 1 1 1 1 II M 1 II 1 II 1 II II 1 II M 1 M 1 1 II 1 1 1 1 1 II II II 1 II 1 II I

Db 3346 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3405

Qy 3571 AATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTAT 3630

M 1 II M 1 II 1 1 II II 1 1 II 1 II 1 1 1 II II II 1 1 1 II M 1 II M II M 1 II II 1 II

Db 3406 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3465

Qy 3631 GAAC GGC AT CAGGT G C AGAT AGAT CAT TAT CT AG GACTT G CAAACAAGAGT GT T AAGGAT 3690

1 ' 1 1 M M 1 1 1 1 1 II II II II II | | | || | M II 1 1 1 II II 1 II 1 1 II MINI III

Db 3466 GAAC GGCAT CAGGCGCAGAT AGAT CATTAT CT AGGACTT GCAAATAAGAAT GTTAAAGAT 3525

Qy 3691 GC C AT GGC CAAAAT C C AAGCAAAAAT C C CT GGAT T GAAGC GCAAAGC AGA 3740

M 1 1 II 1 1 1 1 II II 1 1 1 1 1 1 II 1 II 1 1 II II II 1 1 M II 1 II II 1 II

Db 3526 GCT AT GGCTAAAAT C CAAGCAAAAAT C CCT GGAT T GAAGC GCAAAGC T GA 3575

RESULT 10 
AAF90324 

ID AAF90324 standard; cDNA; 3579 BP. 
XX 

AC AAF90324;
XX 



DT 23-JUL-2001 (first entry) 
XX 

DE Human NOGO-A cDNA. 
XX 

KW NOGO-A; human; chromosome 2p21; neuropathy; spinal injury; brain injury; 

KW stroke; neuronal degeneration; Alzheimer's disease; Parkinson's disease;

KW neuromuscular disorder; psychiatric disorder; developmental disorder; 

KW neuroprotective; nootropic; neuroleptic; antiparkinsonian; 

KW cerebroprotective; neuroleptic; diagnosis; therapy; ss. 
XX 

OS Homo sapiens. 
XX 

PN WO200136631-A1. 
XX 

PD 25-MAY-2001.
XX 

PF 14-NOV-2000; 2000WO-GB004345.



XX 

PR 15-NOV-1999; 99GB-00026995.

PR 24-JAN-2000; 2000GB-00001550.
XX 

PA (SMIK ) SMITHKLINE BEECHAM PLC. 
XX 



PI Michalovich D, Prinjha R; 
XX 

DR WPI; 2001-343822/36. 

DR P-PSDB; AAB82349. 
XX 

PT New polypeptide designated NOGO-C is a splice variant of the human NOGO 

PT gene and may be useful in the treatment of neural disorders including 

PT Alzheimer's and Parkinson's diseases. 
XX 

PS Disclosure; Page 25-26; 25pp; English. 
XX 

CC The present sequence is that of cDNA encoding human NOGO-A (see 

CC AAB82349). NOGO-A is a previously known splice variant of the human NOGO

CC gene on chromosome 2p21. NOGO-A cDNA was obtained by PCR amplification of 

CC human spinal cord cDNA. The invention relates to a novel splice variant, 

CC NOGO-C (see AAF90323). It provides NOGO-C polypeptides and 

CC polynucleotides, and methods for producing such polypeptides by 

CC recombinant techniques. Also disclosed are methods for utilising NOGO-C 

CC polypeptides and polynucleotides in the treatment of diseases including 

CC neuropathies, spinal injury, brain injury, stroke, neuronal degeneration, 

CC for example Alzheimer's disease and Parkinson's disease, neuromuscular 

CC disorders, psychiatric disorders and developmental disorders. Also 

CC provided are methods for identifying agonists and agonists for use in 

CC treating conditions associated with NOGO-C imbalance, and diagnostic 

CC assays for detecting diseases associated with inappropriate NOGO-C 

CC activity or levels 

XX 

SQ Sequence 3579 BP; 1074 A; 803 C; 812 G; 890 T; 0 U; 0 Other; 

Query Match 61.2%; Score 2289.2; DB 4; Length 3579; 

Best Local Similarity 81.5%; Pred. No. 0; 

Matches 2925; Conservative 0; Mismatches 548; Indels 117; Gaps 19; 
Qy  253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312
        ||||||||| | ||||||||  | |||||   ||||||| |||||||||| ||||||||
Db    1 ATGGAAGACCTGGACCAGTCTCCTCTGGT---CTCGTCCTCGGACAGCCCACCCCGGCCG 57

Qy 313 CCGCCCGCCTT CAAGT AC C AGT T C GT GAC GGAGC C C GAGGAC GAGGAG GAC GAGGAGGAG 372 

I MINI I II II N I I II I I I II I I I I I || I I I | | | || | | || | | | | | || IN 
Db 58 C AGC C C G C GTT CAAGT AC C AGT T C GT GAGGGAGC C C GAGGAC GAG GAG GAAGAAGAG 114 

Qy  373 GAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTGCTGGAGAGGAAG 432
        |||||||| |||||||| || ||||| ||||| ||||| |||||||||||||||||||||
Db  115 GAGGAGGAAGAGGAGGACGAGGACGAAGACCTGGAGGAGCTGGAGGTGCTGGAGAGGAAG 174

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 48 6 

I M M M I I I I I I I II I Mill II I II I II 

Db 175 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 234 

Qy 4 87 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 54 6 

M I I M I I I I I II I I I I I I II I I I I I I I II I I I II I I I II I 

Db 235 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 294 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG— CGGCGCCC 5 97 

N I I II II I M II II II I I I II II I I II II I I I I I II I 

Db 295 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 354 



Qy  598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657
        |||||||||| || | | ||||||||||||  ||||||||||||||| ||||||||||||
Db  355 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 414


Qy 


658 


CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 
1 M 1 M 1 1 1 1 I I || | | 1 1 1 M II III III 1 1 | | | | | | Ml | | | 
CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 


711 


Db 


415 


474 


Qy 


712 


1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 | | 1 1 1 1 1 1 I I M 1 1 1 1 1 M 1 
TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 


750 


Db 


475 


534 


Qy 


751 


AGGGGCTCC— GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 II 1 II 1 1 II 1 1 I I I I | | | | | | | | || | | | M 1 1 1 1 II 
AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 


807 


Db 


535 


594 


Qy 


808 


GT GAT AC CCTCCTCT GC AGAAAAAATT AT G GAT T T GAT G GAGCAGC C AGGTAAC AC T GT T 

NIMH 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 MM 1 1 M 1 1 II 1 1 M II 1 1 1 1 1 II 

GT GAT AC GC T C CT C T GC AGAAAA T AT G GAC T T GAAG GAG CAGCC AGGTAAC ACT ATT 


867 


Db 


595 


651 


Qy  868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927
        ||| ||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db  652 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 711


Qy  928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987
        || |||||||||||| | | |||||| |||||||||| |||||||||||| || ||| ||
Db  712 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 771


Qy 


988 


GT GT CAT C C T CAGAAG GAACAAT T GAAGAAAC T T TAAAT GAAGCT T C T AAAGAGTT GC C A 

' 1 1 M 1 M 1 II 1 1 II II 1 M 1 1 1 1 1 1 1 1 | | || 1 1 1 1 1 M 1 II 1 II 

GT AT T AC C C ACT GAAGGAAC AC T T CAAGAAAAT GT CAGT GAAGC T T C T AAAGAGGT CT C A 


1047 


Db 


772 


831 


Qy 


1048 


GAGAGGG CAACAAAT C CAT T T GTAAAT AGAGAT T T AGCAGAAT T T T C AGAAT T AGAAT AT 

1 1 1 1 1 1 1 1 1 II II 1 1 M 1 1 1 M 1 1 1 1 1 1 II II || 1 1 II 1 1 M 1 1 1 1 1 1 1 

GAGAAGG CAAAAACT CT ACT CAT AGAT AGAGAT T TAACAGAGT T T T C AGAAT T AGAAT AC 


1107 


Db 


832 


891 


Qy 


1108 


T C AGAAAT G GGAT CAT CT T T T AAAG GC T C C C CAAAAGGAGAGT CAGC CAT AT T AGT AGAA 

1 1 1 1 1 1 1 1 i M 1 1 1 1 1 1 II 1 1 II 1 1 1 II 1 1 1 III II Ml M MIMI 1 

T CAGAAAT G G GAT CAT C GT T CAGT GT C T CT C CAAAAGCAGAAT CT GC CGT AAT AGT AG C A 


1167 


Db 


892 


951 


Qy 


1168 


AAC ACT AAGGAAGAAGT AAT T GT GAGGAGT AAA GACAAAGAGGATTTAGTTT GTAGT 

N 1 M 1 1 1 1 1 1 1 1 1 1 1 M 1 1 Mill II 1 1 1 1 1 1 1 1 II II 1 1 1 1 

AAT C CT AGGGAAGAAATAAT CGT GAAAAATAAAGAT GAAGAAGAGAAGT T AGT T AGTAAT 


1224 


Db 


952 


1011 


Qy 


1225 


GC AGCC CT T CAC AGT C CACAAGAAT C AC CT GTGGGTAAAGAAGAC 

1 1 M 1 1 1 1 1 1 1 1 1 1 I I | | | | 1 1 1 1 1 1 1 1 1 1 1 
AACATC CTT CATAAT CAACAAGAGTTACCTACAGCT CTTACTAAATT GGTTAAAGAGGAT 


1269 


Db 


1012 


1071 


Qy 


1270 


AGAGTT GT GT CT CCAGAAAAGACAAT GGACATTTTTAAT GAAAT GCAGATGT CAGT AGT A 

1 1 1 1 1 1 1 1 II 1 1 III 1 I I I | || | | | || M | | | | | | | | | 

GAAGT T GT GT CT T CAGAAAAAG C AAAAGAC AGT T T TAAT GAAAAGAGAGTT GC AGT GGAA 


1329 


Db 


1072 


1131 


Qy 


1330 


GC AC CT GT GAGGGAAGAGT AT G CAGAC T T T AAG C CAT T T GAACAAGC AT GGGAAGT GAAA 

II IN 1 M 1 1 1 1 II | | | M M 1 1 1 1 II 1 1 1 1 1 1 | | | || 1 | | | 1 | | | | 1 1 1 1 

GCT C CT AT GAGG GAG GAAT AT GCAGAC T T C AAAC CAT T T GAGC GAGT AT GGGAAGT GAAA 


1389 


Db 


1132 


1191 


Qy 


1390 


GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 


1437 



Db 


1192 


Qy 


1438 


Db 


1249 


Qy 


1498 


Db 


1309 


Qy 


1558 


Db 


1369 


Qy 


1615 


Db 


1429 


Qy 


1675 


Db 


1489 


Qy 


1732 


Db 


1549 


Qy 


1792 


Db 


1609 


Qy 


1852 


Db 


1669 


Qy 


1912 


Db 


1729 


Qy 


1972 


Db 


1789 


Qy 


2032 


Db 


1849 


Qy 


2092 


Db 


1909


Qy 


2152 


Db 


1966 


Qy 


2209 



MM I M I I I I I I I I I I I | I I I I I I I I I | || M 

GATA GTAAGGAAGATAGTGATATGTTGGCTGCTGGAGGTAAAATCGAGAGCAACTTG 124 8 

GAAAGT AAAGT GGAC AGAAAAT GCT T GGAAGATAGC C T GGAG CAAAAAAGT CT T GGGAAG 14 97 

M I M M M M M I I I I I I I I I | | | | | | | | | M M I I I I I I II I M 

GAAAGT AAAGT GGATAAAAAAT GTT T T G CAGATAG C C T T GAG CAAACT AAT CAC GAAAAA 1308 
GAT AGT GAAG GC AGAAAT GAGGAT GCTT CT T T C C C C AGT AC C C CAGAAC CT GT GAAGGAC 1557 

I M M M I I M M M I Ml M M M M M M I M I M M M I I I II II 

GAT AGT GAGAGT AGTAAT GAT GAT AC TTCTTTCCC C AGT AC G C C AGAAGGT AT AAAGGAT 1368 
AGCT C CAGAG CAT AT AT T AC CT GTGCTTCCTT TAC C T CAGCAACCGAAAGCACCACA 1614 

I I M M M M M M M M M M M I I I M M M I M I I I I II 

C GT C C AGGAGC AT ATAT CAC AT GTGCTCCCTT TAAC C C AGCAG CAAC T GAGAGC AT T GC A 1428 
GCAAAC AC TTTCCCTTT GT T AGAAGAT CAT AC T T CAGAAAATAAAAC AGAT GAAAAAAAA 1674 

MUM Ml II II II I I I I I I | | | | | | || | | | || i | | | | || | | | | | | | || | || 

ACAAACATTTT TCCTT T GTTAGGAGAT C CTACTTCAGAAAATAAGAC C GAT GAAAAAAAA 14 88 



M I II M I I II II II I I I I I I I I I I I I I || | | | | | | | | | | 

AT AGAAGAAAAGAAGG C C C AAAT AGT AAC AGAGAAGAAT AC TAG CAC CAAAAC AT CAAAC 154 8 
C C T T T CCT TGT AGC AGT ACAGGAT T CT GAG GCAGAT TAT GT T ACAAC AGAT AC CT T AT C A 1791 

Mill II II I I I I I I I I I II II II I I II I I 1 I I I I | | I II I I II II I I I M II 

C C TT T T CT T GT AGC AGC ACAGGAT T CT GAGAC AGAT TAT GT CAC AAC AGAT AAT T TAAC A 1608 

AAGGT GACT GAG G C AGC AGT GT C AAACATGC CT GAAGGT C T GAC G C C AGAT T T AGT T CAG 1851 
M I II II I II I I I II Ml I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I III 
AAGGT GAC T GAG GAAGT C GT G GCAAAC AT GC CT GAAGG C CT GACT C C AGAT TT AGT AC AG 1668 

GAAGCAT GT GAAAGT GAAC T GAAT GAAGC C AC AGGT ACAAAGAT T GC T TAT GAAACAAAA 1911 

M I I I I I I I I II I II I II II I I II I I I II II II I I I || || | | | | || | | | | | | | | | | 

GAAGCAT GTGAAAGT GAATT GAAT GAAGTTACT GGT ACAAAGAT T GCTTAT GAAACAAAA 172 8 
GT GG AC T T G GT C CAAAC AT C AGAAGCT AT ACAAGAAT CACT T TAC CC CAC AGC AC AGC TT 1971 

MM II I I II I II I I I I I I I I I I II I I I I I II II I I I I I I || | | | 

AT G GAC T T GGT T CAAAC AT CAGAAGT TAT GCAAGAGT CAC T C TAT CCT GC AGC AC AGC T T 17 8 8 
T GC C CAT CAT T T GAG GAAG CT GAAGCAACT C C GT CAC C AGT T TT GC CT GAT AT T GT T AT G 2031 

M M I I I I II II I I II I | | | | | | | M I I I I I I I I I I I I I I I I II I I II II I I I 

TGCCCATCATTTGAAGAGTCAGAAGCTACTCCTTCACCAGTTTTGCCTGACATTGTTATG 184 8 
GAAGCACCATTAAATTCTCTCCTTCCAAGCGCT GGT GCTT CTGTAGTGC AGC CCAGTGTA 2 091 

M I I I II I I II II I II I I I II II II M I M II M II I I I I I I I I I I 

GAAGCACCATT GAATT CTGCAGTTCCTAGT GCT GGT GCTT CCGT GAT AC AGC CCAGCTCA 1908 
T C C C CACT GGAAGC AC CT C CT C C AGT T AGT TAT GAC AGT ATAAAGC T T GAG CCT GAAAAC 2151 

M Ml I I II I II II I I I II I I I II I I II I I I I I I I I I I I I I I I I I I I 

T CAC CAT T AGAAG C T T C T T CAG T T AAT TAT GAAAG C AT AAAAC AT GAG CCT GAAAAC 1965 

C C C C CAC CAT AT GAAGAAGC C AT GAAT GT AG CACT AAAAG C T T T G G GAAC AAAGGAA 2208 

M M I I M II II I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | 

C C C C CAC CAT AT GAAGAGG C CAT GAGT GT AT C ACTAAAAAAAGT AT CAGGAAT AAAGGAA 2025 
GGAAT AAAAGAGC C T GAAAGT T T T AAT G C AGC T GTT CAGGAAAC AGAAG CT C CT T AT ATA 2268 

I I M I I I I M I II I II I I M II II II I II II I I I I II II I I I I I I II II I I I I I 



Db 


2026 


GAAAT TAAAGAG C CT GAAAAT ATTAAT G C AGC T C T T CAAGAAAC AGAAGC T C CTT AT AT A 2085 


Qy 


2269 


T C CAT T GC GT GT GATT T AAT TAAAGAAACAAAGCT C T C C ACT GAG C CAAGT C C AGAT T T C 232 8 

M 1 1 1 1 1 1 1 1 M I 1 1 1 1 1 1 II 1 1 1 1 1 1 IN 1 II 1 1 1 1 1 II 1 1 1 1 1 1 

T CT AT T GCAT GT GAT T TAAT TAAAGAAACAAAGCT T T CT GC T GAAC C AG CT C C GGATT T C 2145 


Db 


2086 


Qy 


2329 


T CTAAT TAT T C AGAAAT AG CAAAAT T C GAGAAGT CGGTGCCC GAAC AC G CT GAGC TAGT G 2388 

1 N II II 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II | I M 1 1 1 M 1 1 1 1 

T CT GAT TAT T C AGAAAT G G CAAAAGTT GAAC AGC C AGT G C C T GAT CAT T C T GAG CTAGT T 22 05 


Db 


2146 


Qy 


2389 


GAG GAT T CCT CAC CT GAAT CT GAAC C AGT T GACT T AT T TAGT GAT GAT T C GATT C CT GAA 244 8 

II 1 M 1 II 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 | | | || | | | | || | || || | | | | | || | | | | | 

GAAGAT T CCT CAC CT GAT T CT GAAC C AGT T GACT TAT T TAGT GAT GATT C AAT AC CT GAC 2265 


Db 


2206 


Qy 


2449 


GT C C CACAAACACAAGAGGAGGCT GT GAT GCT CAT GAAGGAGAGT CT CACT GA A 2 5 02 

1 1 1 1 1 1 M 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 I I II 1 1 

GT T C CACAAAAACAAGAT GAAAC T GT GAT GC T T GT GAAAGAAAGT CT CAC T GAGACTT C A 2325 


Db 


2266 


Qy 


2503 


GT GT CT GAGACAGTAGC C CAGCACAAAGAGGAGAGACTTAGT GCCTCAC CT C AG GAGC T A 2562 

1 1 1 1 1 1 Ml 1 1 1 1 1 1 1 1 1 M 1 1 1 III III 1 

TTT GAGT CAAT GAT AGAAT AT GAAAAT AAGGAAAAACT CAGT GCTTT GC CACCT GAGGGA 2385 


Db 


2326 


Qy 


2563 


GGAAAGC CAT AT T T AGAGT CT T T T C AGC C CAAT T T AC AT AGT ACAAAAGAT GC TGCA 2619 

1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 II 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 

G GAAAG C CAT AT T T G GAAT C T T T T AAG C T CAGT T T AGAT AAC AC AAAAG AT AC C C T GT T A 2445 


Db 


2386 


Qy 


2620 


T C TAAT GAC AT T C CAACAT T GAC CAAAAAG GAGAAAAT TT CT T T GCAAAT G GAAGAGT TT 2 679 

II 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 M 1 1 1 1 1 1 1 1 1 1 Ml | 

CCT GAT GAAGT TT CAACAT T GAGC AAAAAGGAGAAAAT T C CT T T GCAGAT GGAG GAGCT C 2505 


Db 


2446 


Qy 


2680 


AAT ACT GCAAT TT AT T CAAAT GAT GACT TACT T T C T T CT AAGGAAGAC AAAATAAAAGAA 2 739 
1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 II 1 1 1 1 1 II II 1 1 1 1 1 1 1 1 1 1 I | | | | | | | | | 
AGT ACT G C AGT TT AT T CAAAT GAT GACT TAT T TAT T T CT AAGGAAGC AC AGATAAGAGAA 2 565 


Db 


2506 


Qy 


2740 


AGT GAAAC AT TTT C AGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT T T GT C 2799 

1 MINI 1 1 1 M 1 1 II 1 1 1 1 1 1 1 1 Mill II 1 1 1 1 1 II 1 II || || || || 

ACT GAAAC GT T T T CAGATT CAT CT C CAAT T GAAAT TAT AGAT GAGT T C C C T AC AT T GAT C 2 625 


Db 


2566 


Qy 


2800 


AGT GCTAAAGAT GAT T C T C CTAAAT TAG C CAAG GAGT AC AC T GAT CTAGAAGT AT C C 2 8 56 

1 1 1 1 1 Ill I I I | | | | | | | | M | | | | | | | | | | | | | | || | | | || | 

AGT T CT AAAACT GAT T CAT TTT CTAAAT TAG C C AGG GAAT AT AC T GAC C T AGAAGT AT C C 2685 


Db 


2626 


Qy 


2857 


GACAAAAGT GAAAT T GCT AAT AT C CAAAG C GGGG CAGAT T CAT T GC CT T GCT T AGAAT T G 2916 

1 1 1 1 1 M 1 M 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 I I M II 1 1 1 1 1 1 1 1 

CACAAAAGT GAAATT GCTAAT G CC C C GGAT GGAGCT GGGT CAT T GC CT T GC AC AGAAT T G 274 5 


Db 


2686 


Qy 


2917 


C C C T GT GAC CTTTCTTT C AAGAAT AT AT AT C C T AAAG AT GAAG T AC AT GT T T C A 2970 

IN 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 | Mill 

C C C CAT GAC CTTTCTTT GAAGAAC AT AC AAC C CAAAGT T GAAGAGAAAAT CAGT T T CT C A 2 805 


Db 


2746 


Qy 


2971 


GAT GAATT CT C C GAAAAT AG GT C CAGT GT AT C TAAGGC AT C CAT AT C GC CT T CAAAT GT C 3030 

1 1 II 1 II II 1 1 II 1 Mil 1 III iiii i mi i i t i II i i i i 
i i i i i ii ii i i i i i I | t 1 1 III 1 1 1 1 | Ml I I || || IIII 

GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 2865


Db 


2806 


Qy 


3031 


TCTGCTTTG GAAC CT CAGAC AGAAAT G G GC AG C ATAGT TAAAT C CAAAT C ACT T AC GAAA 3090 

1 1 1 1 1 1 II 1 1 IIII 1 II 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 | | | IN IIII 
T CT GCTTT GGC CACT CAAGCAGAGAT AGAGAG CAT AGT TAAACC CAAAGT T CTT GT GAAA 2925 


Db 


2866 



Qy 3091 GAAGCAGAGAAAAAACTTCCTTCTGACACAGAGAAAGAGGACAGATCCCTGTCAGCTGTA 3150
        ||||| ||||||||||||||||| || ||||| |||||||||||||| |  || ||| ||
Db 2926 GAAGCTGAGAAAAAACTTCCTTCCGATACAGAAAAAGAGGACAGATCACCATCTGCTATA 2985

Qy 3151 TTGTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTCTACTGGAGAGACATTAAG 3210
        || |||||||||||||||||||||||||||||||||||||| ||||||||||||||||||
Db 2986 TTTTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTGTACTGGAGAGACATTAAG 3045

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270
        ||||||||||||||||||||||||||| ||||||||||||| ||  ||||||| ||||||
Db 3046 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3105

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330
        ||||| || ||||| |||||||||||||||||||||||||| ||||| ||||||||||||
Db 3106 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3165

Qy 3331 ATATATAAGGGCGTGATCCAGGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCA 3390
        ||||| ||||| |||||||| |||||||||||||||||||||||||||||||||||||||
Db 3166 ATATACAAGGGTGTGATCCAAGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCA 3225

Qy 3391 TATTTAGAATCTGAAGTTGCTATATCAGAGGAATTGGTTCAGAAATACAGTAATTCTGCT 3450
        ||| | |||||||||||||||||||| ||||| ||||||||||| |||||||||||||||
Db 3226 TATCTGGAATCTGAAGTTGCTATATCTGAGGAGTTGGTTCAGAAGTACAGTAATTCTGCT 3285

Qy 3451 CTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTA 3510
        ||||||||||||||| |||| ||||| ||||| ||||| || ||||||||||||||||||
Db 3286 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3345

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570
        |||||||| |||||||||||||||||||||||||| ||||| |||||||||||||||||
Db 3346 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3405

Qy 3571 AATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTAT 3630
        ||||||||||||||||||||||| ||||| || |||||||||||| ||||||||||||||
Db 3406 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3465

Qy 3631 GAACGGCATCAGGTGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGAT 3690
        ||||||||||||| |||||||||||||||||||||||||||||| |||| |||||| |||
Db 3466 GAACGGCATCAGGCGCAGATAGATCATTATCTAGGACTTGCAAATAAGAATGTTAAAGAT 3525

Qy 3691 GCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740
        || ||||| |||||||||||||||||||||||||||||||||||||| ||
Db 3526 GCTATGGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3575

RESULT 11 
ABK90134 

ID ABK90134 standard; DNA; 3579 BP. 
XX 

AC ABK90134; 
XX 

DT 21-OCT-2002 (first entry) 
XX 

DE DNA encoding human NogoA protein. 
XX 

KW Human; Nogo; BACE; acute neuronal injury; spinal injury; head injury; 



KW stroke; peripheral nerve damage; neoplastic disorder; glioblastoma; 

KW neuroblastoma; hyperproliferative disorder; dysproliferative disorder;

KW cirrhosis; psoriasis; keloid formation; fibrocystic condition; cancer; 

KW tissue hypertrophy; central nervous system; axon regeneration; NogoA;

KW Nogo-associated disease; metastasis; gene; ds.
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT CDS 1..3579

FT /*tag= a 

FT /product= "Human NogoA protein" 

XX 

PN WO200257483-A2. 
XX 

PD 25-JUL-2002. 
XX 

PF 18-JAN-2002; 2002WO-GB000228.
XX 

PR 18-JAN-2001; 2001GB-00001312.
XX 

PA (GLAX ) GLAXO GROUP LTD. 

PA (SMIK ) SMITHKLINE BEECHAM PLC.

XX 

PI Blackstock WP, Hale RS, Prinjha R, Rowley A; 
XX 

DR WPI; 2002-599722/64. 

DR P-PSDB; ABG30938. 
XX 

PT Identifying modulators of Nogo or BACE activity for treating acute 

PT neuronal injuries, neoplastic or dysproliferative disorders, comprises

PT providing and monitoring interaction between Nogo and BACE polypeptides. 

XX 

PS Disclosure; Page 53-58; 68pp; English. 
XX 

CC The present invention relates to a new method of identifying modulators 

CC of Nogo function or BACE activity. The method involves providing Nogo and 

CC BACE polypeptides capable of binding with each other, monitoring the 

CC interaction between these polypeptides, and determining if the test agent 

CC is a modulator of Nogo or BACE activity. The method is useful in treating 

CC acute neuronal injuries, such as spinal or head injury, stroke, 

CC peripheral nerve damage, and in neoplastic (e.g. glioblastomas, 

CC neuroblastomas), hyperproliferative or dysproliferative disorders (e.g.

CC cirrhosis, psoriasis, keloid formation, fibrocystic conditions, tissue 

CC hypertrophy) of the central nervous system. The BACE polypeptide is 

CC useful in screening methods to identify agents that may act as modulators 

CC of BACE activity and in particular agents that may be useful in treating 

CC Nogo-associated diseases. The modulators of Nogo or BACE polypeptides, 

CC and the polynucleotide encoding the BACE polypeptide are useful in 

CC manufacturing a medicament for the treatment or prevention of disorders 

CC responsive to the modulation of Nogo activity, in alleviating the 

CC symptoms or improving the condition of a patient suffering from this 

CC disorder, in axon regeneration, or in preventing metastasis or spreading 

CC of a cancer. The polynucleotide may also be an essential component in 

CC assays, a probe, in recombinant protein synthesis, and in gene therapy 

CC techniques. The present nucleic acid sequence encodes the human NogoA 

CC protein of the invention 



XX 

SQ Sequence 3579 BP; 1074 A; 803 C; 812 G; 890 T; 0 U; 0 Other; 



Query Match 61.2%; Score 2289.2; DB 6; Length 3579; 

Best Local Similarity 81.5%; Pred. No. 0; 

Matches 2925; Conservative 0; Mismatches 548; Indels 117; Gaps 19;

Qy  253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312
        ||||||||| | ||||||||  | |||||   ||||||| |||||||||| ||||||||
Db    1 ATGGAAGACCTGGACCAGTCTCCTCTGGT---CTCGTCCTCGGACAGCCCACCCCGGCCG 57

313 CCGCCCGCCTT C AAGT AC CAGT T CGT GAC GGAG C C C GAG GAC GAGGAGGACGAGGAGGAG 372 

I I M I I I I I I I I I I I I I I I I M I I I I I I I I I I I M || Ml 

Db 58 CAGC C C G C GT T CAAGT AC CAGT T C GT GAGGGAG C C C GAG GAC GAGGAG GAAGAAGAG 114 

Qy  373 GAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTGCTGGAGAGGAAG 432
        |||||||| |||||||| || ||||| ||||| ||||| |||||||||||||||||||||
Db  115 GAGGAGGAAGAGGAGGACGAGGACGAAGACCTGGAGGAGCTGGAGGTGCTGGAGAGGAAG 174

Qy 4 33 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

Mill M MINIMI I I I I I I II II I I I I I II I I I I I I I 

Db 175 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 2 34 

Qy 4 87 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 54 6 

M II II I I I I MM II M I I I I I I I I I M || I I I I M I M I I || i i i 

Db 235 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 2 94 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

M I M I I I I I I I I I II I I | | | | | | | | M I II I I I I I I I 

Db 295 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 354 

Qy  598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657
        |||||||||| || | | ||||||||||||  ||||||||||||||| ||||||||||||
Db  355 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 414
Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

M M I M I I I I I II I I I I I M II III III I M Ill III 

Db 415 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 474 

Qy 7 12 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

M I I I I I I I II I I I I I I I I I || I I I I | | | | | | | | M | 
Db 475 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 534 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

M M M M I II II I I M I I I I II I I I I I I I I I I I I || I I I I I I I I I I I I I I I I I I || 

Db 535 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 594 

Qy 8 08 GT GAT AC C C T C C T CTGC AGAAAAAAT TAT G GAT T T GAT G GAGC AGCC AG GT AAC ACT GT T 867 

M I M M I M II I I II I I I I I I I I I I | I MM I I I I I I I I II I II I I I 

Db 595 GTGATAC GCT CCT CTGC AGAAAA TAT GGACTT GAAGGAGC AGCC AGGTAACACT ATT 651 

Qy  868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927
        ||| ||||||||||||||||||||||||||||||||||||||||||| ||||||||||||
Db  652 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 711

Qy  928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987
        || |||||||||||| | | |||||| |||||||||| |||||||||||| || ||| ||
Db  712 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 771

98 8 GT GT CAT C C T CAGAAGGAACAAT T GAAGAAAC T TT AAAT GAAG C T T C T AAAGAGT T GC C A 1047 

N I I M I I I II I I I I I II I I I I I I I I I II I I II I I I I I I I I I I | || 
772 GT ATT AC C CACT GAAGGAACAC T T C AAGAAAAT GT CAGT GAAG C T T CT AAAGAG GT CT C A 831 

1048 GAGAG G G CAACAAAT C CAT T T GT AAAT AGAGAT T T AGC AGAAT T T T C AGAAT T AGAAT AT 1107 
I I I I I I I I I II M I I M I I I I I I I I I I I I I I I I II I | I || | | | M | | | | 
832 GAGAAG GCAAAAACT CT ACT CAT AGAT AGAGAT T TAACAGAGT T T T CAGAAT T AGAAT AC 891 

1108 T C AGAAAT GG GAT CAT CT T T T AAAG G C T C C C C AAAAG GAGAGT C AG C CAT AT T AGT AGAA 1167 

I I I M I I I I I I M I I II I I I I Ml I I I I M I Ml II IN || | | | | M I 

892 T CAGAAAT GGGAT CAT C GT T CAGT GT C T CT C CAAAAG CAGAAT CT GC C GT AAT AGT AGC A 951 

1168 AAC AC TAAGGAAGAAGT AAT T GT GAGGAGT AAA G AC AAAGAG GAT T T AGT T T GT AGT 1224 

II I I I I I I I I I I I I I I I I I I I I I I I || | || | | | | M | | | | | | | 

952 AAT C C T AG GGAAGAAAT AAT CGT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGTAAT 1011 

122 5 GC AG C C C T T C AC AGT C C AC AAGAAT C AC C T GT GGGTAAAGAAGAC 1269 

I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I 

1012 AACATCCTTCATAATCAACAAGAGTTACCTACAGCTCTTACTAAATTGGTTAAAGAGGAT 1071 

1270 AGAGT T GT GT CT C CAGAAAAGACAAT GGAC AT TT T TAAT GAAAT G C AGAT GT CAGT AGT A 1329 

I I I I I I MINI III I I I I I I I I I I I I I I I | | I | | | | | 

1072 GAAGT T GT GT CT T CAGAAAAAGCAAAAGAC AGTT T TAAT GAAAAGAGAGT T G CAGT GGAA 1131 

133 0 GC AC C T GT GAGGGAAGAGT AT GCAGAC TT TAAG C CAT TT GAACAAG CAT G GGAAGT GAAA 1389 

M Ml I I I I I I I II I I I I I I I I I II II MINIM I II I II I II I M I I I I 
1132 GCT C C TAT GAGGGAGGAAT AT GCAGAC T T CAAAC CAT TT GAGCGAGT AT G GGAAGT GAAA 1191 

1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437 

I I M I I II I II I II I I I I I I I I I | | | M I || || 

1192 GATA GTAAGGAAGATAGT GAT AT GTT GGCTGCT GGAGGTAAAAT C GAGAGCAACTT G 124 8 

1438 GAAAGTAAAGT GGACAGAAAAT GCTT GGAAGATAGCCTGGAGCAAAAAAGT CT T GGGAAG 1497 

I I I M I I M II II I I I II I M I I I || I M I I II II II II I III | || 
124 9 GAAAGTAAAGT GGATAAAAAAT GT T T T GC AGAT AGC CT T GAG CAAAC TAAT C AC GAAAAA 1308 

14 98 GAT AGT GAAG G CAGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C C AGAAC CT GT GAAG GAC 1557 

I I II II M I II II I II II I I II I I I I I II II I || I | | | | | | | | Mill 
1309 GAT AGT GAGAGT AGT AAT GAT GAT AC TTCTTTCCC CAGT AC G C C AGAAGGT ATAAAGGAT 1368 

1558 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTACCT CAGCAACCGAAAGCACCACA 1614 

I I M I I M I II I II II II I I I II M I I II II I II II I I II II 

1369 C GT C C AGGAGC AT AT AT C ACAT GTGCTCCCTT T AAC C C AGC AGCAACT GAGAGC AT T G C A 142 8 

1615 G CAAACACT TTCCCTTT GT T AGAAGAT CAT ACT T CAGAAAAT AAAAC AGAT GAAAAAAAA 1674 

I M I I I III II I I II I I I I I I I II II I II I II II || | || M II II II II I I II 
1429 ACAAAC AT TTTTCCTTTGT T AGGAGAT C CTACT T CAGAAAAT AAGAC C GAT GAAAAAAAA 14 88 

1675 AT AGAAG AAAG GAAG G C C C AAAT T AT AAC AG AGAAG AC TAG C C C C AAAAC GT C AAAT 1731 

M M M M I I I M I II I || I I I I I | M | | || | | | | || M I M II I I I II II 
14 89 AT AGAAGAAAAGAAGGC C CAAAT AGT AAC AG AGAAGAAT ACT AGC AC CAAAAC AT CAAAC 154 8 

17 32 CCTTTCCTT GT AG CAGT AC AG GAT T C T G AGGCAGAT TAT GT T ACAAC AGAT AC C T TAT C A 17 91 

Mill I II M II II I II II M II I II I I I II I II II || II I I II II M III II 
1549 CCTTTTCTT GT AGCAGCACAG GAT T C T GAGAC AGAT TAT GT C ACAAC AGAT AAT T T AAC A 1608 



1792 AAG GT GAC T GAGG C AG C AGT GT CAAAC AT GCCT GAAG GT C T GAC GC C AGAT T T AGT T C AG 1851 

I I I I I I I I I I I I I II III I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I | | | 
1609 AAGGT GACT GAGGAAGTC GT GGCAAACAT GCCT GAAGGC CT GACT C C AGAT TT AGT AC AG 1668 

1852 GAAGCAT GT GAAAGT GAACT GAAT GAAGC CACAGGTACAAAGATTGCTTATGAAACAAAA 1911 

I I I I I I I I I I I I I I I I I I I I | | || | | M | | || | | | | | | | M | | | || 

1669 GAAGCAT GT GAAAGT GAAT T GAAT GAAGT T ACT GGTACAAAGATT GCTT ATGAAACAAAA 1728 

1912 GT G GAC T T GGT C CAAAC AT C AGAAG C T AT AC AAGAAT CAC T T T AC C C CAC AG CAC AGCT T 1971 

I I I I I I I I I I I M I I I I I I I I I I Ml I I I I I I I I I I || || | | | M II II I I 
172 9 AT GGACT T G GT T CAAACAT CAGAAGT TAT G C AAGAGT CAC T CT AT C C T GC AGCAC AGCT T 1788 

1972 T GC C CAT CAT T T GAG GAAGCT GAAGC AACT C C GT CAC CAGT T T T GC CT GAT AT T GTT AT G 2031 

Ml I I I I I II I I I I I I Mill I II II I I I II I II M I I II I I I I II I 

178 9 T GC CC AT CAT T T GAAGAGT CAGAAG CT ACT C C T T CAC CAGT T T T GC CT GACAT T GTT AT G 1848 

2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCT GGT GCTT CTGTAGTGCAGCCCAGTGTA 2 091 

I M I I M M II I M I I I I I II II I I II II II I II I I I I II II I I I I 

184 9 GAAGC AC CAT T GAAT T CT G CAGT T C CT AGT GC T G GT GCT T CC GT GAT ACAGC C C AGCT C A 190 8 

2 092 T C CCCACT GGAAGCAC CT C CT C CAGT T AGT TAT GACAGTATAAAGCTT GAGCCT GAAAAC 2151 

II IN I I I M M II I I I I M II II I I II Mill I I I I I II II II I I I 
1909 T CAC CAT T AGAAG CT T CTT C AGT T AAT TAT GAAAG CAT AAAAC AT GAGC C T GAAAAC 1965 

2152 CCCCCACCATATGAAGAAGCCATGAATGTAGCACT AAAAGCTTTGGGAACAAAGGAA 2208 

I I I M I I I II I I I I I M II I I I II I I I I I II I || II I I II II II II I I | 
1966 C C C C CAC CAT AT GAAGAG GC C AT GAGT GT AT CAC T AAAAAAAGT AT CAGGAATAAAGGAA 2025 

2209 G GAATAAAAGAG C CT GAAAGTT T T AAT G C AGC T GTT C AGGAAACAGAAGCT C CTT AT AT A 2268 

I I I I I I I I I I M I M I I I I II II I I II II MM II II II I II I II II I I I II II 
202 6 GAAAT T AAAGAG C CT GAAAAT AT T AAT G CAGC T C T T CAAGAAAC AGAAGCT C CT T AT AT A 2085 

22 69 T C CAT T G C GT GT GAT T T AAT TAAAGAAACAAAG C T CT C CACT GAGC CAAGT C C AGAT TT C 2328 

I I I M II II I I I I M I M I I I I I M II I I II II II M I I I II M I II I M I 

208 6 T C TAT T G CAT GT GAT T T AAT TAAAGAAACAAAG CTT T CT GCT GAAC C AGCT C C GGAT T T C 2145 

232 9 T C TAAT TAT T C AGAAAT AGCAAAAT T C GAGAAGT CGGTGCCC GAAC AC GCT GAGCT AGT G 2388 

I I I M I M I I II I I I I II I I I I I M II I II I I I | | || | | | | M II I I 
214 6 T C T GAT TAT T CAGAAAT GGCAAAAGT T GAAC AG C C AGT GC CT GAT CAT T CT GAGC T AGT T 22 05 

238 9 GAG GATT C CT CAC CT GAAT CT GAAC C AGT T GACT TAT T TAGT GAT GAT T C GAT T C C T GAA 24 4 8 

N I M I II I II I I I I I II I I I I I I I I II I I || M I I I I II I II I || II II I II I I 
2206 GAAGATT CCT CACCTGATT CT GAACCAGTT GACT TAT T TAGT GAT GATT C AAT AC CT GAC 2265 

24 49 GT C C C ACAAACACAAGAGGAG G CT GT GAT GCT CAT GAAGGAGAGT CT CACT GA A 25 02 

I I I I I M II II I I I I II | M II II II I I I I | M I II I II M I I I I 
22 66 GT T C C ACAAAAACAAGAT GAAACT GT GAT GCT T GT GAAAGAAAGT CT C ACT GAGAC T T C A 2325 

2503 GT GT C T GAGAC AGT AG C C CAGC ACAAAGAGGAGAGACT TAGT GCCT CAC CT C AGGAGC T A 2562 

I I Ml I IN M I I I I || | || | | | || 

2 32 6 T T T GAGT CAAT GAT AGAAT AT GAAAAT AAGGAAAAACT CAGT G C TT T GC C AC C T G AGG GA 238 5 

2563 G GAAAGC CAT AT T TAG AGT C T T T T C AG C C CAAT TTACAT AGT ACAAAAGAT G C T G C A 2619 

I I I I I M II I M I I I I I M II I M I I I | | || || || | | 

Db 2386 G GAAAGC CAT AT T T GGAAT C T T T TAAGC T CAGT T T AGAT AAC ACAAAAGAT AC C CT GT T A 2445



Qy 2620 T CTAAT GACAT T C C AAC AT T GAC C AAAAAG GAGAAAAT TT C T T T GC AAAT GGAAGAGT T T 2679

II MM M 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | I I | | I I | | | | | | Mi |

Db 2446 C CT GAT GAAGT T T C AAC AT T GAGCAAAAAG GAGAAAAT T C C T T T GC AGAT GGAGGAG CT C 2505


Qy 2680 AAT AC T GCAAT T TAT T CAAAT GAT GAC TT ACT T T C T T C T AAGGAAGACAAAATAAAAGAA 2739
1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II II II 1 II 1 1 1 1 M 1 1 1 II 1 1 I 1 1 1
Db 2506 AGTACT GCAGTTTATT CAAAT GAT GACTTATTTATTT CTAAGGAAGCACAGATAAGAGAA 2565


Qy 2740 AGT GAAAC AT T T T C AGAT T CAT C T C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT T TGT C 2799

i HUM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 ii ii ii ii 1 1

Db 2566 AC T GAAAC GT T T T C AGAT T CAT C T C CAAT T GAAAT TAT AGAT GAGT T C C CT ACATT GAT C 2625


Qy 2800 AGT GCTAAAGAT GAT T C T C C T AAAT TAGC CAAG GAGT ACACT GAT CT AGAAGTAT CC 2856

1 1 1 M 1 1 1 Mill, 1 | | || | | | | | | | | | || || | || | | | | || | | | | | | ||

Db 2626 AGT T CT AAAAC T GAT T CAT T T T CT AAAT TAG C CAGGGAAT AT ACT GAC CT AGAAGTAT CC 2685


Qy 2857 GAC AAAAGT GAAAT T G CTAAT AT C CAAAG C GGGGCAGAT T C ATT GC CTT G CT TAGAAT T G 2916

INI MIIIMMII II II II 1 MINIMUM IMIIM

Db 2686 CAC AAAAGT GAAAT T G CTAAT GC C C C G GAT GGAGC T G GGT C ATT GC CT T GCAC AGAAT T G 2745


Qy 2917 C C CT GT GAC CTT T C T T T CAAGAAT AT AT AT C C T AAAGAT GAAG T AC AT GT TT C A 2970

Ml 1 1 1 1 1 M M 1 II 1 1 II 1 1 II 1 1 1 II M 1 II 1 1 1 MIM

Db 2746 C C C CAT GAC CTTTCTTT GAAGAAC AT ACAAC C CAAAGT T GAAGAGAAAAT CAGTT T CT C A 2805


Qy 2971 GAT GAAT T CT CC GAAAAT AGGT C C AGT GT AT C T AAGG CATC CAT AT C GC CT T CAAAT GT C 3030

1 1 1 1 1 1 1 II II 1 II 1 1 1 1 1 IMIIM 1 1 1 1 1 1 1 1 M II 1 1

Db 2806 GAT GAC T T T T CTAAAAAT G GGT CT GC T AC AT CAAAG GT GCT CTT AT T GC CT C C AGAT GT T 2865


Qy 3031 TCTGCTTT GGAACCT CAGAC AGAAAT GG G CAG CAT AGT T AAAT CCAAAT C ACT TAC GAAA 3090

1 1 1 1 1 1 1 1 1 1 1 1 1 1 II II II 1 1 1 II M 1 II 1 1 1 1 II 1 1 Ml MM

Db 2866 TCTGCTTTGGC CACT CAAGC AGAGAT AGAGAG CAT AGT T AAAC CCAAAGT T CT T GT GAAA 2925


Qy 3091 GAAG C AGAGAAAAAACTT C C T T CT GAC ACAGAGAAAGAGGAC AGAT C C C T GT C AGC T GT A 3150

MIM 1 1 1 1 II 1 II 1 1 II MM 1 1 1 1 II II 1 II 1 M

Db 2926 GAAG C T GAGAAAAAAC TTCCTTCC GAT ACAGAAAAAGAGGAC AGAT CAC CAT C T GC T AT A 2985


Qy 3151 T T GT CAGC AGAGC T GAGT AAAACT T C AGT T GT T GAC CT C CT C TAC T G GAGAGACAT TAAG 3210

II 1 1 i 1 1 M 1 1 1 1 II II II II II II 1 1 1 1 1 II 1 1 1 1 1 1 | | 1 II M II M 1 1 1 1 1 1 1 1 1

Db 2986 T T T T CAG CAGAG C T GAGT AAAACT T C AGT T GT T GAC CT C CT GT ACT G GAGAGACAT TAAG 3045


Qy 3211 AAGACT GGAGT G GT GT T T GGT GC CAGCT TAT TCCT GCT GCT GTCTCTGACAGTGTT CAGC 3270

1 M II 1 1 M 1 M 1 1 1 1 II 1 II 1 1 1 II 1 1 1 1 1 1 M 1 1 1 II 1 II IMIIM MUM

Db 3046 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3105


Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330
1 1 N 1 M 1 1 1 1 1 1 II 1 1 1 1 1 II 1 || M 1 1 II 1 II M 1 1 1 1 II 1 1 II 1 1 I 1 1 II 1 1
Db 3106 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3165


Qy 3331 AT ATAT AAG GGC GT GAT C C AGGCT AT C CAGAAAT C AGAT GAAGGC CAC C CAT T CAG GG C A 3390

HIM INN II 1 II II 1 1 1 1 II II II II 1 1 II II | | || | | | M II II 1 1 II 1 1 1 II

Db 3166 AT ATACAAGGGT GT GAT C CAAGCT AT C CAGAAAT CAGAT GAAGGCCACC CATT CAGGGCA 3225


Qy 3391 TAT T TAGAAT C T GAAGT T G C TAT AT C AGAGGAAT T GGT T CAGAAAT ACAGTAATT CT GCT 3450

HI 1 1 1 1 1 1 1 II II 1 1 M 1 1 1 1 M III II M II II II II 1 II 1 II II 1

Db 3226 TAT C T G GAAT C T GAAGT T GCT AT AT C T GAGGAGT T GGT T C AGAAGT AC AGT AATT CT GCT 3285


Qy 3451 CT T G GT CAT GT GAAC AG C ACAAT AAAAGAAC T GAGGC GG C T T T T CT T AGT T GAT GAT TT A 3510

Db 3286 C TT GGT CAT GT GAAC T GC AC GAT AAAGGAACT CAGGC G C C T C T T C TT AGT T GAT GAT T T A 3345



Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570

Db 3346 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3405



Qy 3571 AAT G GT C T GACACT ACT GAT TT T AG CT CT GAT CT CAC T C T T CAGT AT T C CT GT TAT T TAT 3630

Db 3406 AAT GGT CTGAC ACT ACT GAT TT T GGCT CT CAT T T C ACT CTT CAGT GT T CCT GT TAT T TAT 3465



Qy 



Db 




Qy 3691 GC C AT GGC CAAAAT C CAAGCAAAAAT CC CT G GAT T GAAGC GCAAAGC AGA 3740

Db 3526 GCT AT G GCTAAAAT C C AAG CAAAAAT CC C T GGATT GAAGCG CAAAGCT GA 3575



RESULT 12 
ABN86601 

ID ABN86601 standard; DNA; 3579 BP. 
XX 

AC ABN86601; 
XX 

DT 05-NOV-2002 (first entry) 
XX 

DE Human neurotransmitter receptor protein Nogo encoding DNA. 



KW Nerve regeneration; neuroprotection; neuronal degeneration; CNS; PNS; 

KW central nervous system; peripheral nervous system; tranquillizer; Nogo; 

KW vulnerary; cerebroprotective; anti-tumour; antidiabetic; anticonvulsant; 

KW nootropic; antiparkinsonian; ophthalmological; analgesic; hepatotropic; 

KW osteopathic; vasotropic; nephrotropic; cytostatic; antigen; gene therapy; 

KW neurotransmitter receptor; human; gene; ds.
XX 

OS Homo sapiens.
XX 

FH Key Location/Qualifiers 

FT CDS 1. .3579 

FT /*tag= a 

FT /product= "Nogo" 

FT /note= "Nogo-A, Nogo-B and Nogo-C" 

XX 

PN US2002072493-A1. 
XX 

PD 13-JUN-2002. 
XX 

PF 28-JUN-2001; 2001US-00893348.
XX 

PR 19-MAY-1998; 98IL-00124500.

PR 21-JUL-1998; 98WO-US014715.

PR 22-DEC-1998; 98US-00218277.

PR 19-MAY-1999; 99US-00314161.
XX 

PA (YEDA ) YEDA RES & DEV CO LTD. 



XX 



XX 

PI Eisenbach-Schwartz M, Hauben E, Cohen IR, Beserman P, Mosonego A; 

PI Moalem G; 

XX 

DR WPI; 2002-607255/65. 

DR P-PSDB; ABB81078, ABB81079, ABB81080. 
XX 

PT Promoting nerve regeneration and preventing neuronal degeneration in the 

PT central/peripheral nervous system from injury/disease, comprises 

PT administering nervous system-specific activated T cells/antigen, or 

PT analogs/peptides.

XX 

PS Disclosure; Page 49-53; 93pp; English. 
XX 

CC The invention relates to promoting nerve regeneration or conferring 

CC neuroprotection and preventing or inhibiting neuronal degeneration in the 

CC central/peripheral nervous system (NS). The method involves administering

CC NS-specific activated T cells, NS-specific antigen, its analogue or its 

CC peptide, a nucleotide sequence the NS-specific antigen or its analogue or 

CC combinations. The method is useful for promoting nerve regeneration and 

CC preventing neuronal degeneration in central/peripheral nervous system 

CC from injury/disease, where the injury is spinal cord injury, blunt 

CC trauma, penetrating trauma, hemorrhagic stroke, ischaemic stroke or 

CC damages caused by surgery such as tumour excision. The disease is not an 

CC autoimmune disease or neoplasm. The disease results in a degenerative 

CC process occurring in either gray or white matter or both. The disease is 

CC diabetic neuropathy, senile dementia, Alzheimer's disease, Parkinson's

CC disease, facial nerve (Bell's) palsy, glaucoma, Huntington's chorea, 

CC amyotrophic lateral sclerosis, non-arteritic optic neuropathy, and 

CC vitamin deficiency, intervertebral disc herniation, prion diseases such 

CC as Creutzfeldt-Jakob disease, carpal tunnel syndrome, peripheral

CC neuropathies associated with various diseases, including but not limited 

CC to uremia, porphyria, hypoglycemia, Sjorgren Larsson syndrome, acute 

CC sensory neuropathy, chronic ataxic neuropathy, biliary cirrhosis, primary 

CC amyloidosis, obstructive lung diseases, acromegaly, malabsorption 

CC syndromes, polycythemia vera, immunoglobulin (Ig)A- and IgG

CC gammapathies, complications of various drugs (e.g., metronidazole) and toxins

CC (e.g., alcohol or organophosphates), Charcot-Marie-Tooth disease, ataxia

CC telangectasia, Friedreich's ataxia, amyloid polyneuropathies, 

CC adrenomyeloneuropathy, Giant axonal neuropathy, Refsum's disease, Fabry's

CC disease, or lipoproteinemia. The present sequence represents a DNA

CC encoding the human neurotransmitter receptor protein Nogo (Nogo-A, Nogo-B 

CC and Nogo-C), an example of NS-specific antigen

XX 

SQ Sequence 3579 BP; 1074 A; 803 C; 812 G; 890 T; 0 U; 0 Other; 

Query Match 61.2%; Score 2289.2; DB 6; Length 3579; 

Best Local Similarity 81.5%; Pred. No. 0; 

Matches 2925; Conservative 0; Mismatches 548; Indels 117; Gaps 19;
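
The summary figures above can be cross-checked against each other. The sketch below assumes (an inference from the numbers, not something the report states) that Query Match is the score divided by the perfect score of 3741 from the run header, and that Best Local Similarity is the number of matches divided by the number of aligned columns (matches + mismatches + indels). The function name `alignment_percentages` is hypothetical.

```python
def alignment_percentages(score, perfect_score, matches, mismatches, indels):
    """Return (query match %, best local similarity %) rounded to one decimal.

    Assumed definitions (not stated in the report):
      query match      = score / perfect_score
      local similarity = matches / (matches + mismatches + indels)
    """
    columns = matches + mismatches + indels
    query_match = 100.0 * score / perfect_score
    similarity = 100.0 * matches / columns
    return round(query_match, 1), round(similarity, 1)


# Figures reported for RESULT 12 (ABN86601); 3741 is the perfect score
# given in the run header.
print(alignment_percentages(2289.2, 3741, 2925, 548, 117))
```

Under these assumptions the function reproduces the reported 61.2% and 81.5% for this hit.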

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I I I II I I I I I II II I I i I I I I I I I I II I I I I I I I I I I I I I I I I I I I M I 
Db 1 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 57 

Qy 313 CCGCCCGCCTT C AAG T AC C AGT T C GT G AC G GAG C C C GAG G AC GAG GAG G AC GAG GAG GAG 372 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II M I 
Db 58 C AGC CCG C GT T CAAGT ACCAGT T CGT GAG G G AGCCC G AG GAC GAGGAG GAAGAAGAG 114 



Qy 373 GAGGAG GAC GAGGAGGAG GAC GAC GAGGAC CT AGAG GAACT G GAG GT GCT GGAGAGGAAG 432 

I I I I I I I I I I I I II I I II I I I I I MIM I I I I I I I I II I I I I I I I I I I I I M II 
Db 115 GAGGAGGAAGAGGAGGACGAGGACGAAGACCT GGAGGAGCTGGAGGTGCT GGAGAGGAAG 174 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

I I I M I I M I | | | | | | | | | in 

Db 175 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 234 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546
I I I N I I I I I I I I I I | | | | | | | | || || | | | | | | | | | | | ( | || || | | | 
Db 235 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 294 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597

M I I II I I I I I I I I I I I I || || | | | | | | | | | | I I I I I I 

Db 295 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 354 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I I I I I II M I I I I I I I I I I I || IN | | 

Db 355 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 414 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I I I I I I I I I I I II I | | | | || | | IN M I I I I I I I I I II I Ml 
Db 415 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 474 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I M I M I I I II I I Mill I I I I II I II I I M I I M I 

Db 475 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 534 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I I I I I I I I I I I I I I I M I I I I I I M II I I II I I 11 II II I I I II I M II I I II I M I 

Db 535 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 594 

Qy 808 GT GAT AC CCTCCTCTG C AGAAAAAAT TAT GGAT T T GAT GGAG C AGC C AGGTAAC ACTGTT 867 

mini: I M 1 1 1 1 1 ii i ii 1 1 ii ii 1 1 mm M M 1 1 1 ii I M M 1 1 m i ii 

Db 595 GT GAT AC GCT C CT CT G C AGAAAA TAT GGAC T T GAAGGAG C AG C C AGGT AACACT AT T 651 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

HI I I I M I I I I I I I M II I M I I I I I I I I I I I II I I I M I II II I I II I I I I 

Db 652 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 711 

Qy 928 C TAT CTCCTCTCTCAACTGTTTCTTTTAAAGAACAT GGAT ACCTTGGTAACTTATCAGCA 987 

II I I I I I I I I I M I I I I I I I I I I II II I I I II I II I II I I I I II II Ml II 

Db 712 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 771

Qy 988 GT GT CAT CCT CAGAAGGAACAATT GAAGAAACTTTAAAT GAAGCTT CTAAAGAGT T GC CA 1047 

M I I II I I I I M I I II II I I I II I I I I I I | | || I I II II II I I I II 
Db 772 GT AT TACCCACT GAAGGAACACTT CAAGAAAAT GT C AGT GAAGCTT CTAAAGAGGT CT CA 831

Qy 1048 GAGAGGGCAAC AAAT C CAT T T GTAAAT AGAGAT TT AGCAGAAT T T T CAGAAT T AGAAT AT 1107 

I I I I I I I I I M M I I M I II I I II II II I II I I I I I M I I I I I II I I I I 
Db 832 GAG AAG G C AAAAACT C TACT CAT AG AT AGAG AT T T AAC AG AGT T T T CAGAAT T AGAAT AC 891 

Qy 1108 T CAGAAAT G GGAT C AT CT T T T AAAGGC T C C C CAAAAG GAGAGT CAGC C AT AT T AGTAGAA 1167

• I I I I I I II II II II II II I I I M Ml I M II I II I I I I I I I 

Db 892 T CAGAAAT G GGAT CAT C GT T CAGT GT CT C T C CAAAAG CAGAAT C T GC C GT AAT AGTAGCA 951 



Qy 1168 AACAC TAAGGAAGAAGT AAT T GT GAGGAGT AAA GACAAAGAGGAT TTAGT TTGTAGT 1224

II I I I I I M I I I I I I I I I I I I I I I I || | | | | | | | I I I I I I | I | 
Db 952 AAT C CT AG G GAAGAAATAAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGTAAT 1011

Qy 1225 G C AG C C C T T C AC AGT C C AC AAGAAT C AC C T GTGGGTAAAGAAGAC 1269

I I I I I I I I I I M I I I I I I I I I II I I I I I I I I 

Db 1012 AACAT C CT T CAT AAT C AACAAGAGT T AC CT AC AG CT CT T AC T AAAT T G GTT AAAGAGGAT 1071

Qy 1270 AGAGTT GT GT CT C C AGAAAAGACAAT GGAC AT T T T TAAT GAAAT G CAGAT GT C AGT AGT A 1329

I I I I I M I I I I I I I I I I Ml I II I I I I I I | | | | | | | | | | | I I I 

Db 1072 GAAGTT GT GT CTT CAGAAAAAGCAAAAGACAGTTTTAAT GAAAAGAGAGTTGCAGT GGAA 1131

Qy 1330 GCAC CT GT GAG GGAAGAGT AT GC AGAC T T T AAGC CAT TT GAACAAG CAT GGGAAGT GAAA 1389

N I I I M M III I I | | | | | M | | || | | | | | | | || 

Db 1132 G CT C CT AT GAG GGAG GAAT AT GC AGAC T T CAAAC C AT TT GAG C GAGT AT G GGAAGT GAAA 1191

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437

MM I I I I I I I I I I I I I I I I I I II I I I I I || | | 

Db 1192 GAT A GT AAGGAAGAT AGT GAT AT GT TGGCTGCTG GAG GT AAAAT C GAGAG CAAC T T G 1248

Qy 1438 GAAAGTAAAGT GGAC AGAAAAT GCT T GGAAGAT AGC C T G GAG CAAAAAAGT CTT G GGAAG 1497

I I I I I I I M II I I I I I I I I I | | | | | | | | | | | | | | | | || | | | | | | || 
Db 1249 GAAAGTAAAGT G GAT AAAAAAT GT T T T G CAGAT AGC CT T GAG CAAACTAAT C AC GAAAAA 1308

Qy 1498 GAT AGT GAAG GCAGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C C AGAAC CT GT GAAGGAC 1557

I I I I I I I I I M I I I I I III I I I I I I I I I I I | | | M | | M II I I I I I I I 
Db 1309 GAT AGT GAGAGT AGT AAT GAT GAT AC TTCTTTCCC CAGT AC G C CAGAAGGT ATAAAGGAT 1368

Qy 1558 AGCT C CAGAGC AT AT ATT AC CT GT GCTTCCTT T AC CT C AGCAACC GAAAG C AC C AC A 1614

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I || 

Db 1369 C GT C CAGGAGC AT AT AT C AC AT GT GCTCCCTT T AAC C CAGCAG C AACT GAGAG CAT T G C A 1428

Qy 1615 GCAAACACTTT CCCTTT GTT AGAAGAT CAT ACTT CAGAAAATAAAACAGAT GAAAAAAAA 1674

I I I I I I Ml I M I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1429 AC AAAC AT TTTTCCTTTGT TAG G AGAT C C T AC T T C AGAAAAT AAGAC C GAT GAAAAAAAA 1488

Qy 1675 AT AGAAGAAAG GAAG GC C C AAAT T AT AAC AGAG AAG AC TAG C C C C AAAAC GT CAAAT 1731

I I I I I I II I I I I II I I I | | | | I I I I I I I I I I I I I I I I 

Db 1489 AT AGAAGAAAAGAAG GC C CAAAT AGT AAC AGAGAAGAAT ACT AGCAC CAAAAC AT CAAAC 1548

Qy 1732 CCTTTCCTT GTAGC AGT ACAG GAT T CT GAGG CAGAT TAT GT T ACAACAGAT AC CT TAT C A 1791

I I I M I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II I | || | | | | | | 
Db 1549 C CTTT T CTT GTAGCAGCACAGGATT CT GAGACAGATTAT GT C AC AAC AGAT AAT TT AAC A 1608

Qy 1792 AAGGT GAC T GAG G C AG CAGT GT CAAAC AT GC C T GAAGGT CT GAC GC CAGAT T T AGTT C AG 1851

I N I I I I I I I I I I || Ml I I I I I I I I I I I M I I I I I I I I I I II I I I I I I I III 
Db 1609 AAGGT GAC T GAG GAAGT C GT G GCAAAC AT GC C T GAAGGC CT GACT C CAGAT T T AGT ACAG 1668

Qy 1852 G AAGC AT GT GAAAGT GAACT GAAT GAAG C C AC AG GTACAAAGATT GC T TAT GAAACAAAA 1911

I I I I I I I I I I I I I M M I I I I I I I I I I I II I I II 

Db 1669 GAAGC AT GT GAAAGT GAAT T GAAT GAAGT TACT G GTACAAAGATT GC T TAT GAAACAAAA 1728

Qy 1912 GTGGACTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAGCACAGCTT 1971

I I I I I N I I I I I I I II I | I I | | I I Ml II I I I I I I I I 

Db 1729 AT G GAC T T G GT T CAAACAT CAGAAGT TAT G CAAGAGT CACT C TAT C CT G CAGCAC AGCT T 1788

Qy 1972 T GC C CAT CAT T T GAG GAAG CT GAAGC AACT C CGT CAC CAGT T T T G C CT GAT AT T GT TAT G 2031

Db 1789 TGCCCATCATTTGAAGAGTCAGAAGCTACTCCTTCACCAGTTTTGCCTGACATTGTTATG 1848


Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I | |
Db 1849 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 1908


Qy 2092 T C C C CAC T G GAAGC AC C T C CT C C AGTT AGT T AT GACAGT ATAAAGCTT GAGC CT GAAAAC 2151

1 Ml 1 1 1 1 1 II II MINI 1 1 1 1 II II 1 1 II 1 1 1 1 1 1 1 II 1 1 1 1 1 1

Db 1909 T CAC CAT T AG AAG CTTCTTCAGTTAATTATGAAAGCATAAAACAT GAGC CT GAAAAC 1965


Qy 2152 CC CCCAC CAT AT GAAGAAGCCAT GAAT GTAGCACT AAAAGCTTTGGGAACAAAGGAA 2208

1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 MM 1 II 1 Mill 1 MM 1 II 1 M 1

Db 1966 CCCC CAC CAT AT GAAGAGGC CAT GAGT GTAT CACTAAAAAAAGTAT C AGGAATAAAGGAA 2025


Qy 2209 G GAATAAAAGAGC CT GAAAGT T T T AAT GC AGCT GT T CAG GAAACAGAAG C T C CT T AT AT A 2268

1 IN 1 M 1 M 1 II 1 II 1 1 1 II 1 1 II II 1 1 MM II 1 II 1 1 1 M II 1 1 M 1 1 1 II

Db 2026 GAAATT AAAGAGC CT GAAAAT AT TAAT GC AGCT CT T CAAGAAAC AGAAG C T C CT T AT AT A 2085


Qy 2269 T C CATT GC GT GT GAT T T AAT T AAAGAAACAAAGCT C T C CAC T GAGC CAAGT C C AGATTT C 2328

M M II 1 II II II 1 II II 1 II II II 1 II 1 1 1 II II II II Ml II 1 II 1 1 1 1

Db 2086 T CT ATT GC AT GT GAT T TAAT TAAAGAAACAAAGCT T T C T G CT GAAC C AGCT C C GGATT T C 2145


Qy 2329 T C TAAT T ATT C AGAAAT AGCAAAAT T C GAGAAGT C G GT GC C C GAAC AC GCT GAG C TAGT G 2388

1 1 1 M M 1 II 1 1 II II II II 1 1 1 1 1 II 1 II 1 II II 1 1 1 II M 1 II 1 1

Db 2146 T C T GAT TAT T C AGAAAT GG CAAAAGT T GAAC AGCCAGT G C CT GAT CATT CT GAGCT AGT T 2205


Qy 2389 GAGGATTC CT CACCT GAAT CTGAACCAGTT GACTT ATTTAGT GAT GAT T CGATT CCT GAA 2448

M II II II II II II 1 1 1 II II 1 1 1 1 II 1 II II 1 II II 1 II II 1 II II 1 M II 1 M

Db 2206 GAAGATT CCT CACCT GATT CT GAACCAGTT GACTTATTTAGT GAT GATT CAATAC CTGAC 2265


Qy 2449 GT C C CACAAACACAAGAGGAGGCT GT GAT GCT CAT GAAGGAGAGT C T C ACT GA A 2502

M M II II 1 1 II 1 II II II 1 1 II II 1 1 II 1 1 II 1 1 1 II 1 1 II M 1

Db 2266 GTT CCACAAAAACAAGAT GAAACT GT GAT GCTT GT GAAAGAAAGT CT CACT GAGACTT CA 2325


Qy 2503 GT GT C T GAGAC AGT AG C C CAG C ACAAAGAGGAGAGAC T T AGT GC CT C AC C T C AGGAGCT A 2562

1 1 M 1 1 III 1 1 II 1 1 II 1 II 1 1 1 Ml III 1

Db 2326 T T T GAGT CAAT GAT AGAAT AT GAAAAT AAGGAAAAACT CAGT GCT T T GC C AC CT GAGGGA 2385


Qy 2563 GGAAAGCCAT AT TTAGAGT CT TTT CAGC C CAATTTACATAGTACAAAAGAT GC TGCA 2619

M 1 1 II 1 II II 1 1 1 II II M 1 1 II 1 II 1 II M 1 1 1 II 1 1

Db 2386 GGAAAGC C AT AT T T GGAAT CT T T TAAGC T CAGT T T AGAT AAC AC AAAAGAT AC C CT GT T A 2445


Qy 2620 T CT AAT GACAT T C CAACAT T GAC CAAAAAGGAGAAAAT T T C T T T G CAAAT G GAAGAGT T T 2679

II MM II II 1 II II 1 1 II 1 1 1 M II 1 1 II II 1 II II II 1 Mill II 1 1

Db 2446 CCT GAT GAAGT T T CAACATT GAG CAAAAAGGAGAAAAT T CCTTT GC AGAT GG AGGAGC T C 2505


Qy 2680 AAT ACT GCAATT TATT CAAAT GAT GACTTACTTT CTT CTAAGGAAGACAAAATAAAAGAA 2739
1 1 1 M 1 M 1 1 II II II 1 1 II II 1 1 1 1 1 1 II II 1 1 M 1 II 1 1 1 1 1 II 1 1 II
Db 2506 AGTACT GCAGTTTATT CAAAT GAT GACT T ATTTATTT CTAAGGAAGCACAGATAAGAGAA 2565


Qy 2740 AGT GAAACAT TTT C AGAT T CAT CT C C GAT T GAGAT AAT AGAT GAATT T C C CAC GT T T GT C 2799

1 M II II 1 1 II 1 II 1 1 1 1 1 II II 1 II II 1 II II

Db 2566 ACT GAAACGTT TT CAGATTCAT CT CCAATT GAAAT TAT AGAT GAGTT CCCTACATT GAT C 2625


Qy 2800 AG T G C T AAAG AT GAT T C T C CTAAAT T AGC CAAGGAGTAC AC T GAT C T AGAAGT AT C C 2856

III 1 1 1 1 1 1 1 II 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 III II 1 1 1 II II 1 1 1 1 1 1 1 1 1 1

Db 2626 AGTT C TAAAAC T GAT T CAT T T T C TAAAT T AGC CAG G GAATAT AC T GAC CT AGAAGT AT C C 2685

Qy 2857 GAC AAAAGT GAAAT T G CT AAT AT C CAAAG C GGGGCAGAT T CAT TGCCTTGCT T AGAAT T G 2916

N M I I I I I I I I I I II | | | | | | | | | | | | | | | M I I I | I I I 

Db 2686 CACAAAAGTGAAATTGCTAATGCCCCGGATGGAGCTGGGTCATTGCCTTGCACAGAATTG 2745

Qy 2917 C C CT GT GAC CTTTCTTT C AAGAAT AT AT AT C C TAAAGAT GAAG TACATGTTTCA 2970

Ml I I I I I I I I I I I I M I I I || | I | | | | 

Db 2746 C C C CAT GAC CTTTCTTT GAAGAAC AT ACAAC C CAAAGT T GAAGAGAAAAT CAGT T T C T C A 2805

Qy 2971 GAT GAAT T CT C C GAAAAT AGGT C CAGT GTAT CT AAGGC AT C CAT AT C G C CTT CAAAT GT C 3030

I N I I I I M M I II I I I I I I I I | | | | | | || || || | | | | | | 

Db 2806 GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 2865

Qy 3031 T C T G CT T T GGAAC CT CAGAC AGAAAT G GG C AGC AT AGT TAAAT C CAAAT C AC T T AC GAAA 3090

IN I I I I I I I I I I I I I I I I I I M IN | | M 

Db 2866 TCTGCTTTGGC C ACT CAAGC AG AGAT AGAGAG CAT AGT TAAAC C CAAAGT T CT T GT GAAA 2925

Qy 3091 GAAGC AGAGAAAAAACTT C CT T CT GAC ACAGAGAAAGAGGAC AGAT C C CT GT CAGC T GT A 3150

INN II I I I I I I I I I I I M II II Mill I II I I I I I I || | || | M III II 
Db 2926 GAAGC T GAGAAAAAAC TTCCTTCC GAT ACAGAAAAAGAGGAC AGAT C AC C AT CT G CT AT A 2985

Qy 3151 TT GT C AGCAGAGCT GAGTAAAAC T T CAGT T GT T GAC CT C CT C T AC T GGAGAGAC AT T AAG 3210

N I I M M II II I I I I I I I I I I I I | | | | | | M M I II II I I I MINI 

Db 2986 TT TT CAG C AGAGCT GAGTAAAAC T T CAGT T GT T GACCT C C T GT AC T GGAGAGAC AT T AAG 3045

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270

I I I I I I I I I I I I I I I I I M I I II II II I I II I I I || | | | | || | | | | | || Mill: 
Db 3046 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGT ATT CAGC 3105

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330

I I I I I M I I I I I I I I I I M II II II II I I I I I I I I I I I Mill II II I II II I M 
Db 3106 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3165

Qy 3331 AT AT AT AAG G G C GT GAT C CAGG CT AT C C AGAAAT CAGAT GAAG GC CACC CAT T C AG GGC A 3390

HI M I I I M I II I I I I II I II II I II II I M I Mill 

Db 3166 AT AT ACAAG GGT GT GAT C CAAG CT AT C C AGAAAT CAGAT GAAG G C CACC CAT T C AGGGC A 3225

Qy 3391 TAT T T AGAAT CT GAAGT T GCT AT AT C AGAGGAAT T GGT T C AGAAAT ACAGTAATT C TGCT 3450

HI I I I M M II II I II II II I I I M I I M II I M M II I II I II II I II 

Db 3226 TAT C T GGAAT CT GAAGT T GCT AT AT C T GAGGAGT T GGT T C AGAAGT AC AGTAATT C TGC T 3285

Qy 3451 CTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTA 3510

I I I I I I I I I I I I I I I I M I M M I II I II Mill M M II II II II I M I I II I 
Db 3286 CT T GGT CAT GT GAAC T GCAC GAT AAAGGAAC T CAGGC GC CT C T T CT T AGTT GAT GATT T A 3345

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570

I I I I I 1 I I I I I I M M I II II II I II I II II I || M I II || || I || || || II II I I 
Db 3346 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3405

Qy 3571 AAT GGT CTGACACTACTGATTTTAGCTCT GAT CTCACT CTT CAGT ATT CCTGTTATTTAT 3630

I I I I H M M M M M MMI II II I M I II I II I I I II II II M I I M 

Db 3406 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3465

Qy 3631 GAAC GGCAT C AGGT G CAGAT AGAT CAT TAT CT AGGACT T GCAAACAAGAGT GT T AAG GAT 3690

I I I I I I I I I I I I I M M M II II II II II II I II I II I I II II II II II M II II I 
Db 3466 GAAC G GC AT CAGGC GC AGAT AGAT CAT TAT CT AGGACT T GCAAAT AAGAAT GT TAAAGAT 3525



Qy 3691 G C CAT G G CCAAAAT C CAAG C AAAAAT C C CT GGAT T GAAG C GC AAAGC AGA 3740 

II Mill I I I I I I I I I II I I I I I I I I I I M I I I | | | || | | | | | | | || 
Db 3526 G C TAT G G CT AAAAT C CAAG CAAAAATC C CT GGAT T GAAG C GCAAAGC T GA 3575 

RESULT 13 
AAD01174 

ID AAD01174 standard; cDNA; 3833 BP. 
XX 

AC AAD01174; 
XX 

DT 02-NOV-2000 (first entry) 
XX 

DE Bovine neurite growth inhibitor Nogo cDNA. 
XX 

KW Bovine; neurite growth inhibitor; Nogo; neural cell; myelin; CNS; 

KW central nervous system; neoplastic disease; antiproliferative; glioma; 

KW antisense gene therapy; neuroblastoma; menagioma; retinoblastoma; 

KW degenerative nerve disease; Alzheimer's disease; Parkinson's disease; 

KW hyperproliferative disorder; benign dysproliferative disorder; diagnosis;

KW psoriasis; tissue hypertrophy; neuronal regeneration; treatment; 

KW structural plasticity; screening; ss. 

XX 

OS Bos sp. 
XX 

PN WO200031235-A2. 
XX 

PD 02-JUN-2000. 
XX 

PF 05-NOV-1999; 99WO-US026160.
XX 

PR 06-NOV-1998; 98US-0107446P.
XX 

PA (SCHW/) SCHWAB M E. 

PA (CHEN/) CHEN M S. 
XX 

PI Schwab ME, Chen MS; 
XX 

DR WPI; 2000-400052/34. 
XX 

PT Nogo proteins and nucleic acids useful for treating neoplastic disorders 

PT of the central nervous system and inducing regeneration of neurons. 

XX 

PS Claim 26; Fig 12; 122pp; English. 
XX 

CC The present sequence is a cDNA encoding bovine Nogo protein which is a 

CC potent neural cell growth inhibitor and is free of all central nervous 

CC system (CNS) myelin material with which it is natively associated. The 

CC present sequence was obtained from bovine spinal cord white matter cDNA 

CC library. Nogo proteins and fragments displaying neurite growth inhibitory 

CC activity are used in the treatment of neoplastic disease of the CNS e.g. 

CC glioma, glioblastoma, medulloblastoma, craniopharyngioma, ependyoma, 

CC pinealoma, haemangioblastoma, acoustic neuroma, oligodendroglioma, 

CC menagioma, neuroblastoma or retinoblastoma and degenerative nerve 

CC diseases e.g. Alzheimer's and Parkinson's diseases. Therapeutics which 

CC promote Nogo activity can be used to treat or prevent hyperproliferative 



CC or benign dysproliferative disorders e.g. psoriasis and tissue

CC hypertrophy. Ribozymes or antisense Nogo nucleic acids can be used to 

CC inhibit production of Nogo protein to induce regeneration of neurons or 

CC to promote structural plasticity of the CNS in disorders where neurite 

CC growth, regeneration or maintenance are deficient or desired. The animal 

CC models can be used in diagnostic and screening methods for predisposition 

CC to disorders and to screen for or test molecules which can treat or 

CC prevent disorders or diseases of the CNS. Note: SEQ ID numbers 35-42 are 

CC referred in claim 32 and SEQ ID NO: 29 in disclosure of the 

CC specification. However the specification does not include sequences for 

CC these SEQ ID numbers 

XX 

SQ Sequence 3833 BP; 1235 A; 717 C; 818 G; 1063 T; 0 U; 0 Other; 

Query Match 50.0%; Score 1869.8; DB 3; Length 3833; 

Best Local Similarity 80.9%; Pred. No. 0; 

Matches 2320; Conservative 0; Mismatches 492; Indels 55; Gaps 10; 
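
The SQ line of this record lists the sequence length followed by per-base counts, and the counts should add up to the length. The sketch below parses such a line and checks that; the helper name `check_sq_line` and the regular expressions are illustrative assumptions based only on the layout visible in this report, not a documented parser for the format.

```python
import re


def check_sq_line(line):
    """Parse an SQ header line and check that the base counts sum to the length.

    Returns (length, counts, consistent), where counts maps each base label
    (A, C, G, T, U, Other) to its count.
    """
    length = int(re.search(r"Sequence\s+(\d+)\s+BP", line).group(1))
    counts = {
        name: int(n)
        for n, name in re.findall(r"(\d+)\s+(A|C|G|T|U|Other);", line)
    }
    return length, counts, sum(counts.values()) == length


sq = "SQ Sequence 3833 BP; 1235 A; 717 C; 818 G; 1063 T; 0 U; 0 Other;"
length, counts, consistent = check_sq_line(sq)
print(length, counts["A"], consistent)
```

For this record the four base counts sum to 3833, so the composition agrees with the stated length.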

Qy 928 CTAT CT C CT CT CT CAACT GTTT CTTTTAAAGAACATGGATACCTT GGTAACTTAT CAGCA 987

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II I I I I II I I II I III Mill 

Db 1 CTAT CT CCTCTCTCAGCCGCTGCTTTTAAAGAACGTGAATACCTTGGT GAT TTAC CAGCA 60 

Qy 988 GT GT CAT CCT CAGAAGGAACAATT GAAGAAACTTT AAAT GAAGCTT CT AAAGAGTT GCCA 1047 

II I I I I I I I II I I I I I I I M I I I I I M I II I I I I I M I I I II II 

Db 61 GT ACT GC CCACT GAAGGAACACT T C CAGCAACT T CAAAT GAAGCTT CTAAAGC ATT CT C A 120 

Qy 1048 GAGAGGGCAACAAAT CCATTT GTAAATAGAGATTTAGCAGAATTTT CAGAATTAGAAT AT 1107 

I I II I II I I I I I I I I I I II I I I I III I I I II I I I I I II I I I I I I I I I I I I I I I 
Db 121 GAGAAGGCAAAAAAT CC AT T T GT AGAGAGAAATT T AAC AGAAT T TT CAGAAT T GGAAT AT 180 

Qy 1108 TCAGAAATGGGATCATCTTTTAAAGGCTCCCCAAAAGGAGAGTCAGCCATATTAGTAGAA 1167 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 181 T CAGAAAT G GAAT CAT CAT T C AGT GGCT CT CAAAAGGC AGAAC CT G C CGT AACAGT AGCG 240 

Qy 1168 AACACTAAGGAAGAAGTAATT GT GAGGAGT AAAGACA AAG AGG AT T T AGTT T GT AGT 1224 

II I II I I I I I I II I II I I I I I I I I I I I I I II I I I I I I II I I I M I 

Db 241 AATCCTAGGGACGAAATAGTTGTGAGGAGTAGAGATAAAGAAGAGGACTTAGTTAGTCTT 300 

Qy 1225 GC AGC C C T T C AC AGT C C ACAAGAAT C AC CT GTGGGTAAAGAA 1266 

I I I I I I I I I I I I I I I I I III I I I I I 

Db 301 AAC AT C C T T CAT ACT C AGC AG GAGT TAT C T AC AGT CCT T AC GAAAT C AGT T GAAGAAGAA 360 

Qy 1267 GAC AGAGT T GT GT CT CC AGAAAAGACAAT GGACAT T TT TAAT GAAAT GCAGAT GT CAGT A 1326 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I 

Db 361 GAT AGAGT T CT GT C T C C AGAAAAAACAAAGGAC AGT TT TAAG GAAAAGGGAGT T GCAGC A 420

Qy 1327 GTAGCAC CT GT GAGGGAAGAGT ATGCAGACT TTAAGCCATTT GAACAAGCAT GGGAAGT G 1386 



Db 



421 




480 



Qy 1387 AAAGAT ACT T AT GAG GGAAGT AGGGAT GT G C T GG CT GC T AGAG CT AAT 1434



Db 



481 




540 



Qy 1435 GTGGAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAAGTCTTGGG 1494



Db 



541 




600 



Qy 1495 AAGGATAGT GAAGGCAGAAAT GAGGATGCTT CTTT CC C CAGTAC C CCAGAACCT GT GAAG 1554 

II M M II I I I I I I I I I I M I I I I I I II I I I I I I I I I M I I I MM I 
Db 601 AAAGAT AGT GAAAG C AGT AAT GAT G AC AC T T CAT T T C C CAGTAC AC C AGAAG C T GT AAG A 660 

Qy 1555 GAC AGC T C C AGAG CAT AT AT T AC CTGTGCTTCCTT T AC C T C AG C AAC C GAAAG C AC C AC A 1614 

I I M I M M I I I I I I I I I I M M I I M I M M I I I I I I I 

Db 661 G GT GGT T C C GGAGC GT AC AT CAC GTGTGCTCCCTT T AAC C C AAC AACT GAGAAT GT T T C A 720 

Qy 1615 GCAAACACT TTCCCTTTGT T AGAAGAT C AT ACTT C AGAAAAT AAAAC AGAT GAAAAAAAA 1674 

I I M I I Ml M I II I I I M I I I M I I I I I I M II M I I I I I I II I II I M II 

Db 721 ACAAACATTTT T C C CTT GT T GGAAGAT CAT ACTT CGGAAAATAAGACAGATGAAAAAAAG 780 
Qy 1675 ATAGAAGAAAGGAAGGC CCAAATTATAACAGAGAAGA CTAGCCCCAAAACGTCAAAT 1731

M II M II I I II M I I II M M II M M M M Ml I II M I M M 

Db 781 AT AGAA- AAAAAAAGGCACAAAT T GTAACAGAGAAGAAT GCAAGT GT CAAGAC AT CAAAC 839 

Qy 1732 CCTTTCCTT GT AGCAGTAC AGGATT C T GAGGC AGAT TAT GT T ACAACAGAT AC C T TAT C A 1791 

II I II I I II I II I I I I M II III II II II II I I I II II I I II II I III 

Db 840 CCTTTCCT TAT G G C AGCACAGGAGT CTAAGAC AGAT TAC GT TACAAC AGAT CAT GT GT C A 899

Qy 1792 AAGGT GACT G AG GCAG C AGT GT C AAACAT GCCT GAAGGT CT GAC G C C AGAT T T AGT T CAG 1851

I I I I M I I I I I I II I I I I I I M I II II I I I I II I II I II I I M M II I II I I I 
Db 900 AAGGT GAC C GAGGAAGT AGT GGCAAAC AT GC CT GAAGGT CTAAC C C C AGATT T GGT T CAG 959 

Qy 1852 GAAG CAT GT GAAAGT GAAC T GAAT GAAGC CAC AGGTACAAAGAT T GCT TAT GAAACAAAA 1911 

I II I I II II I II I II I II M I II I I II I II II II II I I I I I I I I II II II M II 

Db 960 GAAGCAT GT GAAAGT GAAT T GAAT GAAG C TACT GGTACAAAAATT G C CT TT GAAACAAAA 1019 

Qy 1912 GT GGACT T G GT C CAAACAT C AGAAG CT AT ACAAGAAT C ACT T TAC C CC ACAG CACAGCTT 1971 

Mill MM Mill I II II I II I I II M M II M II II I I 11 II II M 

Db 1020 ATGGACCTGGTTCAAACTTCAGAAGCTGTGCAGGAGTCACTTTACCCTGTAACACAGCTT 1079

Qy 1972 TGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATATTGTTATG 2031 

II I II I II I II I I III II I II II I I II I M I I I I II II I II I II I Mill III 

Db 1080 T GCC CAT CTT T T GAAGAAT CT GAAG CT ACT CC GTCACC G GT T T T G CCT GAC ATT GT CAT G 1139 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

II I II I I I II II I I II I I I MM II II II II II I II I I II II M I I MM I 

Db 1140 GAAGCACCATTAAATTCTGTAGTTCCTAGTGCTGGTGCTTCTGCAGTGCAGCTCAGTTCA 1199 

Qy 2092 T C C C CACT GGAAGCAC CT C C T C C AGT TAGT TAT GACAGT AT AAAG C TT GAGC CT GAAAAC 2151 

M Ml I Ml I I MM II M M M M M M M I M I M M M I M II II 

Db 1200 T CAC CAT TAGAAAC TCTTCCTT C AGT T AAT TAT GAAAGCAT AAAGT TT GAG C CT GAAAAT 1259 
Qy 2152 C C C C CAC CAT AT GAAGAAG C CAT GAAT GT AGC ACT AAAAGCTTTGGGAACAAAGGAA 2208 

II I I II II II II I I M I II II I II II II MM I I II I I MM M III 

Db 1260 C C C C CAC CAT AT GAGGAGGC CAT GAAT GT AT CACT AAAAAAAGAAT CAGGAAT GAAT GAA 1319 

Qy 2209 GGAATAAAAGAGC CT GAAAGT TT T AAT GCAGCT GT T CAGGAAACAGAAGCT C CT TAT AT A 2268 

I III I II II II II I I II III II II M II I I I II II II II II II II II II II II 
Db 1320 GAAAT CACAGAGC CT GAAGGT ATT AGT GTAGCT GTT CAGGAAACAGAAGCT C CTT AT AT A 1379 

Qy 2269 T C CAT T G C GT GT GAT T T AAT T AAAGAAACAAAGCT CT C CAC T GAGC CAAGT C CAGAT T TC 2328 

M Mill M M M M M M I M II M M I II MM Mill M I II M M M M 

Db 1380 T CTAT T GCAT GT GAT T T AAT TAAAGAAACAAAGAT CT CT ACT GAAC C GACT C CAGAT T T C 1439 



Qy 2 329 TCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGCTAGTG 2388 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I | | | | I I I I I I I I 

Db 14 40 T C T AGT TAT T C AGAAAT AGCAGAAGT T G CAC AGC CAGT GC C CG AGC ATT CT GAGCT AGT T 14 99 

Qy 2389 GAGGATTCCT CAC CTGAATCTGAAC CAGT TGACT TAT TT AGT GAT GATTCGATTCCTGAA 2 4 48 

II I I I I I I I I II II I I I I I I I II I I II I I II I I I I I I I II I II || I II I I III 

Db 1500 GAAGAT TCCTCCCCC GAT T CT GAAC CAGT T GACT TAT T TAGT GAT GAT T CAAT AC C C GAA 1559 

Qy 24 4 9 GT C C CACAAACACAAGAG GAGG CT GT GAT GC T CAT GAAGGAGAGT CT CAC T GAAGT GT C - 2507 

M II I II M MINI II Mill II M MM II I M I II M M I II 

Db 1560 GT T C C ACAAAAACAAGAT GAAGC T GT AAT ACT T GT GAAAGAAAAC C T C ACT GAAAT T T C A 1619 

Qy 25 08 T GAGACAGTAG C C CAGC ACAAAGAGGAGAGAC T TAGT GC CT C AC CTCAGGAG 2559 

I II II MM Ml I M I I I I I I I II I II II I 

Db 1620 T C T GAGT CAAT GACAG GACAT GACAAT AAGGGAAAACT CAGT G CT TC AC CAT C AC CT GAG 1679 

Qy 25 60 CT AGGAAAG C CAT AT T T AGAGT CTT T T C AGCCCAAT T T AC AT AGT ACAAAAGAT GC T 2 616 

I II I II II II II I I I I M II II I I I I I II MM I II I II I I II I 
Db 1680 GGAG GAAAAC C GT AT TT GGAGT CTTTT C AGC C CAGTTT AG GCAT CACAAAAGATAC CTT A 1739 

Qy 2617 GCAT CTAAT GACAT T CCAAC ATT G AC CAAAAAGGAGAAAATTTCTTT GCAAAT GGAAGAG 2676 

Ml II I I II II II II I M I I I II I I I I M M I I I M I M M Mill Ml 

Db 174 0 GCAC CT GAT GAAGTT T CAG CAT T GAC CCAAAAGGAGAAAAT C C C T TT G C AGAT GGAGGAG 17 99 

Qy 2677 T TTAAT ACT GCAAT T T ATT CAAATGAT GACT T AC T TT CT T C T AAGGAAGACAAAATAAAA 2736 

I I M M M II I I II I M II I M II II II I I M II II II M II I I 

Db 1800 CT CAAT ACT GCAGT TT ATT CAAGT GAT G G CT TAT T CAT T GCT CAGGAAGCAAAC CT AAGA 1859 

Qy 2737 GAAAGT GAAAC AT T T T CAG AT T CAT C T C C GAT T G AGAT AAT AGAT GAAT T T C C CAC GT T T 27 96 

I I II II I II II II II II M II I I I I II I I I I II II I II II I II I II II II II III 
Db 1860 GAAAGT G AAACAT T T T C AGATT CAT CT CC GAT T G AGAT TAT AGAT GAGTT CCCGACCT T T 1919 

Qy 27 97 GT CAGT GCT AA AGAT GATT CT C CTAAATT AGC CAAGGAGTACACT GAT CT AGAAGT A 2853 

MUM I I II MM I I II I II I I II I II II II I I I M I I II II II I I M I 

Db 192 0 GT CAGT T CT AAAGCAGATT CTT C T C CTACAT TAG C CAGGGAAT ACACT GAC CT AGAAGT A 197 9 

Qy 28 54 T CC GACAAAAGT GAAAT T GCTAATAT C CAAAG C G GGGC AGAT T C ATT GC CT T GCT T AGAA 2913 

M M M M I M M M II M I M M I Mill I M M 1 MM Ml 

Db 1980 GCCCACAAAAGTGAAATTGCTGACATCCAGGATGGAGCTGGGTCATTGGCTTGTGCAGGA 2039 

Qy 2914 T T G C C CT GT GAC CT TT C TT T CAAGAATAT AT AT C C T AAAGAT GAAGT ACAT GT T T CAGAT 2973 

I II M I I II I I II M M I II M I MM I II I I II II Mill I I I II Mill 

Db 204 0 T T GC CC C AT GAC CT TT CTT T CAAGAGTAT ACAAC C TAAAGAGGAAGT T CAT GT C C CAGAT 2099 

Qy 2974 GAAT T CT CCGAAAAT AGGT C CAGT GT AT CT AAGGC AT C CAT AT CGCCT T CAAAT GT CT CT 3033 

II I I II M II II I II II I I I I II I MM I I I I M I I II II I 

Db 2100 GAGTT CT CCAAAGATAGGG GTGAT GT TT CAAAG GT GC C C GT ACT GCC T CC AGAT GT TT CT 2159 

Qy 3034 GCT T T G GAAC C T CAGACAGAAAT GGG CAG CAT AGT T AAAT C CAAAT C ACT T AC GAAAGAA 3093 

II M M II II II M M II II M I M M I II I I M M Ml II II I M 

Db 2160 GCT T T GG AT GCT C AAGCAGAGAT AG G CAGCAT AGAAAAACC CAAAGT T CT T GT GAAAGAA 2219 

Qy 3094 GCAGAGAAAAAAC TTCCTTCT GAC AC AGAGAAAGAGGACAG AT C C C T GT C AG C T GTATT G 3153 

I I II I I II I I I I II II I II II II II I I I I II I I II II I I I I I I I I II 
Db 2220 GCC GAGAGAAAAC TTCCTTCT GATACAGAAAAAGAGC GAAGAT CT C CAT CT GC T AT AT T T 2279 

Qy 3154 T CAG CAG AGC T G AGT AAAACT T CAGT T GT T G ACCT CCT CT ACT G G AG AGACATT AAG AAG 3213 



Db 2280 T CAGCAGAGCT GAGTAAAACTT CAGTT GTT GACCTC CTCTACT GGAGAGACATTAAGAAG 2339 

Qy 3214 ACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATT 3273 

i I M I I I I I M I II M I I I I I I I I M I I I I I I I I I I I II Mill!!! I I II II I II 
Db 2340 ACTGGAGTGGTGTTTGGTGCCAGCTTGTTCCTGCTGCTCTCGCTGACAGTATTCAGCATT 2399 

Qy 3274 GTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATA 3333 

II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I II II I I I I I II I I 
Db 2400 GTGAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCTGTGACTATCAGCTTTAGGATA 2459 

Qy 3334 T AT AAGGGC GT GAT CC AGGCT AT C CAGAAAT CAGAT GAAGGC C AC C C ATT C AGG GCATAT 3393 

II I I I I II I I I I I I I I I I I I I I M II I I I I I I I I I I II I I I I I I I I II I II I I I II II 
Db 2460 T ATAAGG GT GT GAT C C AG G CT AT C CAGAAAT CT GAT GAAGGC CAC C CAT T CAGGGCAT AT 2519 

Qy 3394 T T AGAAT C TGAAGTTGCT AT AT CAGAGGAATTGGTT CAGAAAT AC AGTAATTCTGCTCTT 34 53 

I I M M M I M M M I II I II I Mill M M I I II I M Mill M M M M II M 

Db 252 0 TT GGAAT CTGAAGTTGCTATAT CT GAGGAGTTGGT T CAGAAGTACAGCAAT T CT GCT CTT 257 9 

Qy 3454 G GTC AT GT GAAC AGCACAATAAAAGAAC T GAGGC GG C TT T T C T T AGT T GAT GAT T TAGTT 3513 

I I II I I I I III I I I I I I I II I I I I I I I II I I II I I I I I I II I I I I I I I I I I I I I 
Db 2580 GGT C AT GT T AACT G CACAAT AAAAGAAC T CAGAC G C CT CT T CT TAGT T GAT GAT T TAGT T 2639 

Qy 3514 GATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAAT 357 3 

I I I I I I II I I I I I I I I II II I I I I I I I II I I I I I I I I I I I II I I M I II I I II I I I I 
Db 264 0 GATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTCAAT 2699 

Qy 357 4 GGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTATGAA 3633 

I M I I I I II II I II I I I II M II I II I I I I I I I I II II I I I I I I I I I I I II I I I II 

Db 2700 GGTCTGACACTACTAATTTTGGCTCTGATTTCACTCTTCAGTGTTCCTGTTATTTATGAA 2759 

Qy 3 634 C GGC AT C AGGT GCAGAT AGAT C ATT AT CTAG GACT T GCAAACAAGAGT GT TAAGGAT GCC 3693 

II II I I II II Ml II II II II II II M II II II II II I MM MUM Mill 

Db 2760 C GGC AT CAGG C GCAAAT AG AT C ATT AT CT G GGAC T T GCAAAT AAGAAT GT TAAAGAT GCT 2819 

Qy 3694 AT GGC CAAAAT CCAAGCAAAAAT C C C T G GAT T GAAGC GCAAAGCAG A 374 0 

Mill II II M I II I II II M II I II I I II II II I I I I I I I I II 
Db 2820 AT GGCTAAAAT CCAAGCAAAAAT CCCT GGATT GAAGCGTAAAGCT GA 2866 



RESULT 14 
AAV30920 

ID AAV30920 standard; cDNA; 2386 BP. 
XX 

AC AAV30920; 
XX 

DT 14-SEP-1998 (first entry) 
XX 

DE Human secreted protein BG160_1 cDNA. 
XX 

KW BG160_1; secreted protein; protein factor; human; ds. 
XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 
FT CDS 102..2030 

FT /*tag= a 



FT sig_peptide 1863..1899 

FT /*tag= b 

FT /note= "putative leader/signal peptide" 

FT mat_peptide 1900..2027 

FT /*tag= c 
XX 

PN WO9817687-A2. 
XX 

PD 30-APR-1998. 
XX 

PF 24-OCT-1997; 97WO-US019590. 
XX 

PR 25-OCT-1996; 96US-00740274. 

PR 24-OCT-1997; 97US-00740274. 
XX 

PA (GEMY ) GENETICS INST INC. 
XX 

PI Jacobs K, Mccoy JM, Lavallie ER, Racie LA, Merberg D, Treacy M; 

PI Spaulding V, Agostino MJ; 

XX 

DR WPI; 1998-261426/23. 

DR P-PSDB; AAW58383. 
XX 

PT Nucleic acid encoding secreted protein from human cells - useful, e.g. as 

PT immuno-modulators, anti-tumour agents, promoters of tissue growth, 

PT haemostatic and thrombolytic agents etc. 
XX 

PS Claim 20; Page 74-75; 114pp; English. 
XX 

CC This cDNA clone, designated BG160_1, codes for a novel human secreted 

CC protein (see AAW58383). It was isolated from a human adult brain cDNA 

CC library using methods selective for cDNAs that encode secreted proteins. 

CC The clone is deposited in composite clone ATCC 98232; an oligonucleotide 

CC (see AAT99725) is designed to isolate the clone from the composite. The 

CC predicted AT415_4 amino acid sequence shows homology to neuroendocrine- 

CC specific proteins. Novel cDNA clones (see AAV30916-32) coding for human 

CC secreted proteins (see AAW58580-90) are claimed. These can be used for 

CC recombinant production of the secreted proteins for analysis, 

CC characterisation, diagnostic or therapeutic use. They can also be used as 

CC tissue or mol.wt. markers, for chromosome identification, to identify 

CC genetic disorders, to isolate new related DNA, as sources of primers for 

CC PCR, to generate antibodies, and in interaction trap assays. The secreted 

CC proteins may also have many biological activities, e.g. cytokine, 

CC immunomodulator, haematopoiesis regulating activity, tissue growth 

CC activity, activin or inhibin activity, chemotactic or chemokinetic 

CC activity, haemostatic and thrombolytic activity, receptor/ligand 

CC activity, antiinflammatory, cadherin and tumour invasion suppressor 

CC activity, and tumour inhibition activity. The proteins can be expressed 

CC in vivo from DNA, introduced in gene therapy vectors 

XX 

SQ Sequence 2386 BP; 756 A; 450 C; 494 G; 686 T; 0 U; 0 Other; 

Query Match 37.7%; Score 1411.2; DB 2; Length 2386; 
Best Local Similarity 83.3%; Pred. No. 5e-288; 

Matches 1702; Conservative 0; Mismatches 303; Indels 39; Gaps 7; 



Qy 1718 C CAAAAC GTC AAAT CCTTTCCTT GT AG C AGT ACAG GAT T CT GAG GC AGAT TAT GT T AC AA 1777 



Db 1 C CAAAACAT C AAACC CTT T T C T T GT AGC AGCACAG GAT T CT GAGAC AGAT TAT GT C ACAA 60 

Qy 1778 CAGAT AC CT TAT CAAAGGT GAC T GAGG C AG CAGT GT CAAACAT G C CT GAAG GT C T GAC GC 1837 

MINI II I M I I II I I M I I I I I II I I I I i II I I I I I I II I II I Mill I 

Db 61 CAGAT AATT TAACAAAGGT GACT GAGGAAGT C GT G GCAAACAT GC C T GAAG G C C T GACT C 120 

Qy 18 38 CAGAT T T AGT T C AGGAAG CAT G T GAAAG T GAACT GAAT GAAG C C AC AG GT AC AAAG AT T G 1897 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I M 
Db 121 CAGATTTAGT ACAGGAAGCATGTGAAAGT GAATT GAAT GAAGTT ACT GGT ACAAAGATT G 180 

Qy 1898 CTTAT GAAACAAAAGT GGACTT GGT C CAAACAT CAGAAGCTAT ACAAGAAT CACTTTAC C 1957 

I I I II I I I I II I I I I I I I I I I I I I II I I I I II II I I I III I II I I II I I I II I 
Db 181 C T TAT GAAAC AAAAAT GGAC T T GGT T CAAACAT CAGAAGT T AT G C AAGAGT CAC T C TAT C 240 

Qy 1958 CCACAGCACAGCTTTGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGC 2017 

I II M I I I I I I I I II M I I I I I I I I I I I I INN Mill M M M I II II M 

Db 241 C T G CAGCAC AGC T T T G C C CAT C ATT T GAAGAGT C AGAAG CT AC T C CT T CAC CAGTT T T GC 300 

Qy 2018 CTGATATTGTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAG 2 077 

I I I I I I I I I I II M II I I II I II I II II II MM II I II I II I II M II 

Db 301 C T GACAT T GT TAT GGAAG CAC C ATT GAAT T CT GCAGTT C CT AGT GCT G GT G CT T C C GT G A 360 

Qy 207 8 T GCAGCCCAGT GT AT CCCCACT GGAAGCACCT CCT CCAGTT AGTTAT GAC AGT AT AAAGC 2137 

I I I II II I I III III I MM M II I I I II I I I I II I II II I I I I 
Db 361 TACAGC CCAGCT CAT CAC C ATT AGAAG CTT CTT CAGTTAAT TAT GAAAGC AT AAAAC 417 

Qy 2138 T T GAGC C T GAAAAC C C C C CAC CAT AT GAAGAAGC C AT GAAT GTAGCACT AAAAGCTT 2194 

II I II II I M M M M M M II II II II M M M M I MM MM Mill I 

Db 418 AT GAGC C T GAAAAC C C C C CAC CAT AT GAAGAGG C CAT GAGT GTAT CAC TAAAAAAAGT AT 477 

Qy 2195 T GGGAACAAAG GAAG GAAT AAAAGAGCC T GAAAGT T TT AAT GCAGCT GT T C AGGAAAC AG 2254 

I I I I II II II II III II I I I I II I I M I I I II M II II M I I II M II I I I 

Db 478 CAGGAATAAAGGAAGAAATTAAAGAGCCTGAAAATATTAAT GCAGCT CTT CAAGAAACAG 537 

Qy 22 55 AAGC T CCT TAT AT AT C CAT T G C GT GT GAT TTAATTAAAGAAACAAAG C T CT C CAC T GAGC 2314 

I II I II II II I II II I II I II II M I II II II I I II I II I I I I II II II MM I 

Db 538 AAGCT CCT T AT AT AT CT AT T GC AT GT GAT T T AATT AAAGAAACAAAGCT T T CT GCT GAAC 597 

Qy 2315 CAAGT C CAGATT T CT CT AAT T ATT CAGAAAT AGCAAAAT T C GAGAAGT CGGT GC C C GAAC 2374 

II III I II II II II M II II II II I II IMIM I II II I II II I II I 

Db 598 C AGCTC C GGAT T T CT C T GAT TAT T CAGAAAT GGCAAAAGTT GAAC AGC CAGT G C CT GAT C 657 

Qy 2375 ACGCTGAGCTAGT GGAGGAT TC CTCACCT GAAT CT GAAC CAGTT GACTTAT TT AGT GAT G 2434 

I II M M M M M M I M M M M I M M M M M M II M M M M M M M M 

Db 658 AT T C T GAG CT AGT T GAAGAT T C C T CAC C T GAT T CT GAAC CAGT T GAC T TAT T T AGT GAT G 717 

Qy 2435 ATT C GAT T C C T GAAGT C C C ACAAAC ACAAGAG GAGG CT GT GAT G CT CAT GAAGGAGAGT C 24 94 

II I I II II II I II I II I I II I I II II II II I I II II II II II M I II I 

Db 718 ATT C AAT AC C T GAC GT T CC AC AAAAAC AAGAT GAAACT GT GAT GCT T GT GAAAGAAAGT C 777 

Qy 2495 TCACTGA AGT GT CT GAGACAGT AGCC CAGCACAAAGAGGAGAGACTT AGT GC CT 2548 

II I I II I II I III I III II I I I II I II II I I 

Db 77 8 T CACTGAGAC T T CAT T T GAGT CAAT GAT AGAATAT GAAAAT AAGGAAAAACT CAGT G CT T 837 

Qy 254 9 C AC CT C AG GAG C TAG GAAAG C CAT AT T TAG AGT CT T T T C AGC C CAAT T T AC AT AGT AC AA 2608 

II I Ml I M M M M M I I M II IMIM Ml M MM Ml MM 



Db 838 T GC C AC C T GAGGGAG GAAAGC C AT AT T T GGAAT C T T T T AAGCT CAGTT TAGAT AACACAA 897 

Qy 2609 AAGATGC T G C AT C T AAT G AC AT T C C AAC AT T G AC C AAAAAG G AG AAAAT TTCTTTGC 2665 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I 

Db 898 AAGAT AC C CT GTT AC CT GAT GAAGTT T CAAC AT T GAG CAAAAAGGAGAAAAT T C CTTT GC 957 

Qy 2666 AAAT G GAAGAGT T TAAT ACT G CAAT T TAT T CAAAT GAT GACT TAC T T T C T T C T AAGGAAG 2725 

I I I I I I III I I I II I I I I I I I I I I I I I I II I I I I I I I I II I I II I I I I I I I 
Db 958 AGAT G GAG GAGCT CAGTAC T GC AGT T TAT T CAAAT GAT GAC T TAT T TAT T T CTAAGGAAG 1017 

Qy 2726 ACAAAATAAAAGAAAGT GAAAC AT T T T CAGAT T C AT CT C C GAT T G AG AT AAT AGAT G AAT 2785 

I I I I I Mill 1111)1 I I I I M I I II I M II II I I I I I II I I I II I I I I 
Db 1018 C AC AGAT AAGAGAAAC T GAAAC GT T T T CAGAT T CAT C T C CAAT T GAAAT TAT AGAT GAGT 1077 

Qy 2786 TTCCCACGTTTGTCAGTGCTAAAGATGATTC T CCTAAATTAGC CAAGGAGT ACAC T G 2 842 

I I I I I II I I II I I I I II I I II I I I I I I I I I I I I I II I I I II II I I 
Db 1078 TCCCTACATTGATCAGTTCTAAAACTGATTCATTTTCTAAATTAGCCAGGGAATATACTG 1137 

Qy 2 843 AT CT AGAAGT AT C C GACAAAAGT GAAATT G CT AATAT C CAAAG C GGGGC AGATT C ATT GC 2 902 

I I I I I I I I I I I II I I II I I II I I I I I II I I I I I II Mill M I II II 

Db 113 8 AC CT AGAAGT AT C C CACAAAAGT GAAAT T GC TAAT GC C CC G GAT G GAGCT GGGT CAT T GC 1197 

Qy 2903 CTTGCTTAGAATT GCCCTGT GAC CTTTCTTT CAAGAATAT ATAT CCTAAAGAT GAAG 2959 

MM! M II M M M M M M M M M Mill Ml I I I MM I M M 

Db 1198 C T T G C AC AG AAT T G C C C CAT GAC CTTTCTTT GAAG AAC AT AC AAC C C AAAGT T GAAG AG A 1257 

Qy 2 960 T ACAT GT T T CAGAT GAAT T CT C C GAAAATAGGT C CAGT GTATC TAAG G CAT C CAT AT 3016 

I I I II II II I I II M I II II I II I I II I II M MM 
Db 1258 AAAT C AGT TT C TC AGAT GACT T T T CT AAAAAT GG GTC T GCT ACAT CAAAG GT GCT CTT AT 1317 

Qy 3017 C GCCT T CAAAT GTCTCTGCTTT GGAACCT CAGAC AGAAAT GGGCAGC AT AGT T AAAT C C A 3076 

I I II II I I I 1 I II I M II II MM II M II I I II I I II II II I I II 
Db 1318 T GCCT CCAGAT GTTTCT GCTTT GGCCACT CAAGCAGAGATAGAGAGCATAGTTAAAC CCA 1377 

Qy 3077 AAT C ACT TAC GAAAGAAGC AGAGAAAAAACT T C CT T C T GACACAGAGAAAGAGGACAGAT 3136 

M Ml II M I M M M I M M M M I M II I II Mill M M M II M M I 

Db 137 8 AAGTT CT T GT GAAAGAAGCT GAGAAAAAACTT CCTTCC GATACAGAAAAAGAGGACAGAT 1437 

Qy 3137 CCCTGTCAGCTGTATTGTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTCTACT 3196 

I I II III MM II I II M I II I M I II I II II II II M II II M II II I MM 

Db 1438 C AC CAT C T GCTAT AT T T T C AGC AGAG CT GAGTAAAAC T T CAGTT GT T GAC CT C CT GTACT 14 97 

Qy 3197 GGAGAGACATTAAGAAGACT GGAGT GGT GTTT GGT GC CAGCTTATT CCT GCT GCT GTCT C 3256 

I II II II II II II I I I II M II II II II I II II I M I I I II II II I II II II II II 
Db 1498 GGAGAGACAT T AAGAAGACT GGAGT GGT GTT T GGT GCCAGC CT AT T CCT GCT GCT T T CAT 1557 

Qy 3257 TGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGA 3316 

I I I II I I I I I II II I I I I II Mill II M II I II II I II I II II M II II I MM 
Db 1558 TGACAGTATTCAGCATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGA 1617 

Qy 3317 C TAT C AGCTT T AGGAT ATATAAGGGC GT GAT C CAGGC TAT C CAGAAAT CAGAT GAAGG C C 337 6 

I II I M M M M M M II Mill M II M M M M M II M I M II M M M M M 

Db 1618 C CAT CAGCTTTAGGAT ATACAAGGGTGT GATCCAAGCTAT CCAGAAATCAGAT GAAGGC C 1677 

Qy 3377 AC C CAT T CAG G GC AT AT T T AGAAT CT GAAGT T GCT AT AT CAGAGGAATT GGT T CAGAAAT 3436 

I M M II II II II I I I M I II II II I II II I M II I II I I I I 

Db 1678 ACCCATTCAGG GAAGTT GCT AT AT CT GAGGAGTT GGT T CAGAAGT 1722 



Qy 3437 ACAGTAATTCTGCTCTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCT 3496 

M ! M M ! I II I M I M I I II M II M II MM M M I M M I M M I M MM 

Db 1723 ACAGTAATTCTGCTCTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCT 1782 

Qy 34 97 TAGT T GAT GATT TAGT T GAT T C C CT GAAGT T T GCAGT GT T GAT GT GG GT GT TTAC T TAT G 3556 

M M M M M M M M M M M M M M M M M M M I M M M M I M M I MM 

Db 17 83 T AGT T GAT GATT TAGT TGAT T CT C T GAAGT T T GCAGT GT T GAT GT G GGT AT TT AC CT AT G 18 42 

Qy 3557 TTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTA 3616 

M M M M M M I M M M M M M M M M M M I M M I M M M M M M M 

Db 1843 TTGGTGCCTTGTTTAATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTG 1902 

Qy 3617 T T C CT GT T AT TT AT GAAC GGC AT CAGGT GC AGATAGAT CAT TAT CT AG GAC TT G CAAAC A 3676 

M M M M M M M M M I I I M M M M M M M M M M M M I M I M M M I I 

Db 1903 T T C C T GT TAT TT AT GAAC G GC AT CAGGCAC AGATAGAT CAT TAT CTAGGACTT GCAAAT A 1962 

Qy 3677 AGAGT GT TAAGGAT GC CAT GGC C AAAAT C CAAGCAAAAATC C CT G GAT T GAAG C GCAAAG 3736 

Ml M M M M M I M M I M M M M M M M M M M M M M M M M M M I 

Db 1963 AG AAT G T T AAAGAT G C TAT GGC T AAAAT C C AAG C AAAAAT C C C T G GAT T GAAG C G C AAAG 2022 

Qy 3737 CAGA 3740 

I I I 

Db 2023 CTGA 2026 



RESULT 15 
AAF98399 

ID AAF98399 standard; cDNA; 2386 BP. 
XX 

AC AAF98399; 
XX 

DT 07-JUN-2001 (first entry) 
XX 

DE Human cDNA clone BG160_1 sequence SEQ ID 41. 
XX 

KW Human; secreted protein; nutrient; cytokine modulator; proliferation; 

KW differentiation; immune system modulator; tissue growth; chemotactic; 

KW haemostatic; thrombolytic; anti-inflammatory; tumour inhibition; ss; 

KW haematopoiesis. 
XX 

OS Homo sapiens. 
XX 

PN WO200119988-A1. 
XX 

PD 22-MAR-2001. 
XX 

PF 14-SEP-2000; 2000WO-US025135. 
XX 

PR 17-SEP-1999; 99US-00398829. 
XX 

PA (GEMY ) GENETICS INST INC. 
XX 

PI Jacobs K, Mccoy JM, Lavallie ER, Collins-Racie LA, Evans C; 

PI Merberg D, Treacy M, Bowman MR, Spaulding V, Agostino MJ; 
XX 

DR WPI; 2001-244801/25. 



DR P-PSDB; AAB90682. 
XX 

PT Isolated nucleic acids encoding polypeptides, useful for modulating e.g. 

PT cytokine and cell proliferation/differentiation activity, the immune 

PT system and hematopoiesis regulating activity. 
XX 

PS Claim 1; Page 408-409; 557pp; English. 
XX 

CC Human cDNA clones represented in AAF98374 - AAF98489 encode secreted 

CC proteins AAB90667 - AAB90750. The cDNA clones are isolated from various 

CC tissue types, and may be used in the prevention, treatment and diagnosis 

CC of diseases associated with inappropriate protein expression. The 

CC polypeptides and nucleic acids may be used as nutrients or to modulate 

CC cytokine and cell proliferation/differentiation activity and may also be 

CC involved in modulation of the immune system. The cDNA sequences, 

CC proteins, their agonists and/or antagonists exhibit haematopoiesis 

CC regulating activity; tissue growth activity; activin/inhibin activity; 

CC chemotactic/chemokinetic activity; haemostatic and thrombolytic activity; 

CC receptor/ligand activity; anti-inflammatory activity; haematopoiesis 

CC activity; cadherin/tumour suppressor activity; and/or tumour inhibition 

CC activity. Included in the invention are probes represented in AAF98490 - 

CC AAF98572 which are specific for the cDNA clones encoding the secreted 

CC proteins 

XX 

SQ Sequence 2386 BP; 756 A; 448 C; 496 G; 686 T; 0 U; 0 Other; 

Query Match 37.6%; Score 1408; DB 5; Length 2386; 

Best Local Similarity 83.2%; Pred. No. 2.4e-287; 

Matches 1700; Conservative 0; Mismatches 305; Indels 39; Gaps 7; 

Qy 1718 C CAAAAC GT CAAAT CCT TT CCT T GT AGC AGT ACAGGAT T CT GAG GCAGAT T AT GT T ACAA 1777 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I II I I 

Db 1 C CAAAACAT CAAAC C CTTT T CTT GT AGC AGCACAG GATT CT GAGACAGATT AT GT CACAA 60 

Qy 177 8 CAGAT AC CT T AT C AAAGGT GAC T GAG GC AGC AGT GT CAAAC AT GC CT GAAGGT C T GAC GC 1837 

M I I I I III II I I I I I I I II I I I I II III II I I M I I I I I I I I I I I II I I I 
Db 61 CAGATAATT TAACAAAGGT GACT GAGGAAGT C GT GGCAAACATGC CT GAAGGC CT GACT C 12 0 

Qy 1838 CAGAT T T AGT T C AG GAAGC AT GT GAAAGT GAAC T GAAT GAAGC C AC AGGT AC AAAGAT T G 1897 

II I I M II I I I I I I I II I I II I I II II I I I I MINIMI II I II I I I I II I II I 

Db 121 CAGAT T T AGTACAGGAAGC AT GT GAAAGT GAATT GAAT GAAGTT ACT GGT ACAAAGAT T G 18 0 

Qy 1898 CTTATGAAACAAAAGT GGACTT GGT C CAAACAT CAGAAGCTATACAAGAAT CACTTTACC 1957 

I I I I I I I II I I I I I I II I I I I II I I I I I I I I I I I I I I III Mill I I II I II I 
Db 181 CTT AT GAAACAAAAAT GGACT T GGT T CAAACAT C AGAAGT T AT GCAAGAGT CACT CT AT C 24 0 

Qy 1958 CCACAGCACAGCTTTGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGC 2017 

I M I I I II I II II I II II I II I I II I II I Mill Mill I II II I II I M II 
Db 241 CTGCAGCACAGCTTTGCCCATCATTTGAAGAGTCAGAAGCTACTCCTTCACCAGTTTTGC 300 

Qy 2018 CTGATATTGTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAG 2077 

INI I I M I I M M I M I M I M I M I M I MM M M M II M M I M 

Db 301 CTGACATTGTTATGGAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGA 360 

Qy 2078 TGCAGCCCAGTGTATCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGC 2137 

I M I II I II III Ml I MM II II MIMI I II I II M Mill I 

Db 3 61 T AC AGC C C AGCT CAT C ACC AT T AGAAG C T T CT T C AGT T AAT T AT GAAAGC AT AAAAC 417 



Qy 2138 T T GAG C C T G AAAAC C C C C C AC CAT AT G AAG AAG C CAT G AAT G T AG C AC T AAAAGCTT 2194 

I II I M II I I I I I I I M II I I M I I I II M I I I I I M I I II MM Mill I 

Db 418 AT GAG C CT GAAAAC C C C C CAC CAT AT GAAGAGGC C AT GAGT GT AT CACTAAAAAAAGT AT 47 7 

Qy 2195 T GGGAACAAAGGAAG GAAT AAAAGAG C CT GAAAGT T T T AAT G C AGCT GT T CAGGAAACAG 2254 

MM I I M II M III II II M II I I M I I II I I II II I M MM M I II II 

Db 4 78 CAGGAATAAAGGAAGAAATTAAAGAGCCT GAAAAT ATTAAT GCAGCT CTT CAAGAAACAG 537 

Qy 2255 AAGCT C CT T AT AT AT CCAT T G C GT GT GAT T T AAT T AAAGAAACAAAG CT CT CCACT G AGC 2314 

I II II II I II II II I I I I I II II II II II I M M I I II II I I I M M II II II I 

Db 538 AAGCT C CTT AT AT AT CTATTGCAT GT GATTTAATTAAAGAAACAAAGCTTT CT GCT GAAC 597 

Qy 2315 C AAGT C C AGAT T T C T CT AATT AT T C AGAAATAGCAAAAT T CGAGAAGT C GGT G C C C GAAC 2374 

II III II II II I II I II I II II I I I II II II I I I M M I II I II II I 

Db 598 CAGC T C C GGAT T T C T CT GATT AT T C AGAAAT GG CAAAAGT T GAACAG C C AGT GC CT GATC 657 

Qy 2375 AC GCT GAG C T AG T G GAG GAT T C C T CAC C T GAAT C T GAAC C AGT T GAC T TAT T T AGT GAT G 2434 

I I II I I I I I II II II II II II M II II II II II II M II II II II M II II I I II 
Db 658 AT T CT GAG CT AGTT GAAGATT C CT CAC CT GAT T C T GAAC CAGT T GACT T AT TT AGT GATG 717 

Qy 2435 AT T C GAT T C C T GAAGT C C C AC AAAC ACAAGAGGAGGCT GT GAT GCT CAT GAAGGAGAGT C 2494 

I I II II II II I II I I II II I I I II II M II M II II II II I I II MM 
Db 718 AT T CAAT ACCT GACGT T C CACAAAAACAAG AT GAAACT GT GAT GCT T GT GAAAGAAAGT C 777 

Qy 24 95 TCACTGA AGT GT CT GAGACAGT AGC C C AG C ACAAAG AGGAGAGACTT AGT GC CT 254 8 

I II II II II I III I III I I II I II I I I M I I 

Db 778 T CACTGAGACTT CAT TT GAGT CAAT GAT AGAATAT GAAAAT AAGGAAAAACTCAGTGCTT 837 

Qy 254 9 C AC CT C AGGAG CTAG GAAAGC CAT AT TT AGAGT C T T T T C AGC C CAAT T T ACAT AGTACAA 2608 

Ml II I I M II M II II M II M M M M I M M M I I M I Mil 

Db 838 T GC CAC C T GAG GGAG GAAAGC C AT AT TT GGAAT C T T T TAAGCT CAGT T T AGAT AACACAA 897 

Qy 2609 AAGATGC T G CAT C T AAT GAC AT T C C AAC AT T GAC C AAAAAG G AG AAAAT TTCTTTGC 2665 

I II II I I I I I I I I II II II I II II II I I II II I I II II I I M I II I 

Db 898 AAGATACC CTGT TACCT GATGAAGTTTCAACATT GAGCAAAAAGGAGAAAATT CCTTT GC 957 

Qy 2666 AAAT GGAAGAGT TT AAT ACT GCAAT T T AT T CAAAT GAT GACTT ACT T T CT T CT AAGGAAG 2725 

I II I I I III I I II I M II I I I II II I I I I I I I II II I I II II I II II I II I 
Db 958 AGAT G GAGGAGCT CAGT ACT G CAGTTT AT T CAAAT GAT GACTT ATTTATTT CT AAG GAAG 1017 

Qy 2726 ACAAAAT AAAAGAAAGT GAAACATT T T CAGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT 2785 

I I II I Mill MUM II II II II I I I II M II II II I II M II II I I I 
Db 1018 CAC AGAT AAGAGAAAC T GAAAC GT T T T C AGATT CAT CT C CAAT T GAAAT TAT AGAT GAGT 1077 

Qy 2786 TTCCCACGTTTGT CAGT GCT AAAGAT GATT C T C CTAAATT AGC CAAG GAGT ACAC T G 2842 

I II II II II II I I I II I II II II I II II II I I I I II M I M I M I 
Db 1078 T CC C T ACAT T GAT C AGT T C T AAAAC T GAT T CAT T T T CTAAATT AGC CAGG GAAT AT ACT G 1137 

Qy 2843 AT CT AGAAGT AT C C GACAAAAGT GAAAT T G CT AAT AT C C AAAGC GGGGC AGAT T CAT T GC 2902 

I II II II II II II I II I I I I II II II I I I I I I I II I I II I I M M I I 

Db 1138 ACCT AGAAGT AT CCCACAAAAGT GAAATT GCT AAT GCCCCGGAT GGAGCT GGGT CATT GC 1197 

Qy 2903 CT T GCT TAGAATT GCC CTGT GAC CTT TCTTTCAAGAAT AT ATATCCT AAAGAT GAAG 2959 

I II I I II II I II II I I I I II I II II II Mill III I II II II I I II I 
Db 1198 C T T G C AC AGAAT T GC C C CAT GAC CT T T C T T T G AAGAAC AT AC AAC C C AAAGT T GAAGAGA 1257 



2960 T AC AT GT T T C AGAT GAAT TCT C C GAAAAT AG GT C C AGT GT AT CTAAGGC AT C CAT AT 3016 

I I I I I I I I I II I I I I I I I I I I I I I I I I I M I I | I II 

1258 AAATCAGTTTCTCAGATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTAT 1317 

3 017 C G C CT T CAAAT GT CTCTGCTTTG GAAC CT C AGAC AGAAAT G GG C AGCAT AGT T AAAT C CA 307 6 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I | | | | | | | | M I I I 
1318 T G C CT C CAGAT GTTTCTGCTTT GGC C ACT CAAG C AGAGAT AGAGAG CAT AGT TAAAC C C A 1377 

3077 AAT CACTTACGAAAGAAGCAGAGAAAAAACTT CCTT CT GACACAGAGAAAGAGGACAGAT 3136 

II IN I M I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I | || | | | | | | 
1378 AAGT T CT T GT GAAAGAAGC T GAGAAAAAACT T C CT T C C GAT ACAGAAAAAGAG GACAGAT 1437 

3137 CCCTGTCAGCTGTATTGTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTCTACT 3196 

I I II Ml I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I | | | | | M I 
1438 C AC CAT CT G CT AT AT T T T C AGCAGAGC T GAGT AAAACT T C AGT T GT T GAC C T C CT GT ACT 1497 

3197 GGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTC 3256 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I M | | | | | | || 
14 98 GGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCAT 1557 

3257 TGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGA 3316 

I I I I I I I I I I I I I II I I I I I I II I I I I I I M | | | | | | | | | | | | | | | | | | | 

1558 TGACAGTATTCAGCATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGA 1617 

3317 CTATCAGCTTTAGGATATATAAGGGCGT GAT CCAGGCTATC CAGAAAT CAGAT GAAGGC C 3376 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I M I M I I I I I I I I I I 
1618 C CAT C AGCTT T AG GATAT AC AAGGGT GT GAT C CAAGCT AT C CAGAAAT CAGAT GAAGGC C 1677 

3377 AC CCAT T CAGGGCAT ATTTAGAAT CT GAAGTT GCTATAT CAGAGGAATT GGTT CAGAAAT 3436 

N I I I I I II I I I I I I I I I I I I I I I I | | | I I I I I I I I I I I I I I 

167 8 ACCCATTCAGG GAAGTT GCTATAT CTGAGGAGTTGGTTCAGAAGT 1722 

3437 ACAGTAATTCTGCTCTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCT 3496 

I I I I M I I I I I I I I I I I I I M I I I I I I I I MM I II II II II I I | M I 

1723 ACAGTAATTCTGCTCTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCT 1782 

34 97 T AGT T GAT GAT T T AGT T GAT T C C C T GAAGTT T GC AGT GT T GAT GT GGGT GT T TACT TAT G 3556 

I I N I M II I II I I II I II || I I I | | || II II II II II II || I MM 

17 83 T AGT T GAT GAT T T AGT T GAT T C T CT GAAGT T T G C AGT GT T GAT GT GGGT AT T T AC CT AT G 1842 

3557 TTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTA 3 616 

I I I I M I II II II II || || || I II II II II I I || || || | II M I II II II II I || 

1843 TTGGTGCCTTGTTTAATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTG 1902 

3617 T T C C T GT T ATT T AT GAAC GGC AT C AGGT GCAGAT AGAT CAT T AT CT AGGAC TT GCAAAC A 3676 

II M I II M II II II I II II I II II || II II I I II II II II I M II II II II M I 

1903 T T GGT GT TAT T TAT GAAC GG CAT C AGGC ACAGAT AGAT CAT TAT CT AGGACT T GCAAAT A 1962 

3677 AGAGT GT TAAGGAT GC CAT GGC CAAAAT C CAAG CAAAAAT C C C T G GATT GAAG C GCAAAG 3736 

I I I I M II I MMI II I II I I II I I II I I I || || M II II II II I M I II II II II 
19 63 AGAAT GT T AAAGAT GC T AT GG C TAAAAT C C AAGC AAAAAT C C C T G GAT T GAAG C GCAAAG 2022 

3737 CAGA 3740 
I I I 

2023 CTGA 2026 
RESULT 1 

US-09-484-970B-106 

; Sequence 106, Application US/09484970B 

; Patent No. 6426186 

; GENERAL INFORMATION: 

; APPLICANT: Jones, Karen A. 

; APPLICANT: Volkmuth, Wayne 
; APPLICANT: Walker, Michael G. 
; TITLE OF INVENTION: BONE REMODELING GENES 
; FILE REFERENCE: PB-0014 US 

; CURRENT APPLICATION NUMBER: US/09/484,970B 
; CURRENT FILING DATE: 2000-01-18 
; NUMBER OF SEQ ID NOS: 172 
; SOFTWARE: PERL Program 
; SEQ ID NO 106 
; LENGTH: 4822 
; TYPE: DNA 
; ORGANISM: Homo sapiens 



; FEATURE: 
; NAME/KEY: misc_feature 
; OTHER INFORMATION: Incyte ID No. 6426186 444857. 15CB1 
; NAME/KEY: unsure 
; LOCATION: 33, 51, 79, 211, 369, 483-484, 731, 748, 4803, 4805-4806, 4808-4809, 
; OTHER INFORMATION: a, t, c, g, or other 
US-09-484-970B-106 

Query Match 62.1%; Score 2323.8; DB 4; Length 4822; 

Best Local Similarity 80.9%; Pred. No. 0; 

Matches 3060; Conservative 0; Mismatches 587; Indels 137; Gaps 25; 

Qy 63 CGCGAAGGCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTTCG 122 

I II MINI II I I I I I I I I II I I I I I I M I I I I I I 1 I I II I I I I I I I IN 
Db 7 8 CNCGGAGGCAGGAGGAGCAGTCTCATTGTTCCGGGAGCCGTCACCACAGTAGGTCCCTCG 137 

Qy 123 GCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAAC 182 

I I I I I I II I I I I I I I I I I I I I I I I II I II I I II I I I 

Db 138 GCTCAGT CGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAAC 182 

Qy 183 CGCCCGCGACTCTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCG 241 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I 1 I I I 

Db 183 CGCCCGCGGCTCTGAGACGCGGCCCCGGNGGCGGCGGCAGCAGCTGCAGCATCATC-TCC 241 

Qy 242 ACCCGCCAGCCATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCC 301 

I I II I M I II II I I I I I I I I I I I I I I I I I I I I I I M I II I I I I I I I I I I I 
Db 242 AC C C T C C AGC C AT GGAAGAC CT G GAC CAGT CT C CT CT GGT CTCGTCCTCGGACAGCC 298 

Qy 3 02 CGCCCCGGCCTCCGCC C GC CTT CAAGT AC CAGT T C GT GAC GG AGC C C GAGGAC GAG GAGG 361 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2 99 C AC CCCGGCCG CAGC C C GC GTT CAAGTAC CAGT T C GT GAG GGAGC C C GAG GAC GAG GAG - 357 

Qy 3 62 AC GAGGAGGAG G AGGAGGAC GAGGAG GAG GAC GAC GAGGAC CTAGAGGAACT GGAG GT GC 421 

II II I I I I I II II I I I I I I I I II II I I I I I I II I I I I I I I I I I I I I I I 
Db 358 — GAAGAAGAGGANGATGAAGAGGAGGACGAGGACGAAGACCTGGAGGAGCTGGAGGTGC 415 

Qy 422 TGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCG 475 

I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 

Db 416 TGGAGAGGAAGCCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCG 475 

Qy 476 CCGCGCCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGC 535 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 476 GCGCGCCNNTAATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGC 535 

Qy 536 CGGCCGCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG 589 

I I I I I I I I I I I I I III II III I I I I I I I II I I I I I I I I I I I I 
Db 536 CGGCCGCTCCCCCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGT 595 

Qy 590 CGGCGCCCGCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAG 646 

I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I 
Db 596 CGACCGTGCCCGCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTG 655 

Qy 647 AGGACGACGAGCCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGG 706 

I I I I I I I I I I I I II I I I I I I I I I II I I I I I II II III III I I I M I II II 
Db 656 AGGACGACGAGCCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGG 715 



Qy 707 CGGAGC CCGCCGCGCCCCCTTCCACGCCGGCCG 739 

I I I I I I I I I I I M 1 1 I I I I I I I I I I I I I I I 

Db 716 CAGAGCCCGTGTGG7^JCCCGCCAGCCCCGGCTNCCGCCGCGCCCCCCTCCACCCCGGCCG 775 

Qy 740 CGCCCAAGCGCAGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTG 796 

I I I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I I I I I I 
Db 776 CGCCCAAGCGCAGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTG 835 

Qy 797 CATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGATTTGATGGAGCAGCCAG 856 

I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I III! I I I I I II I I M 
Db 836 CAT CT GAGC CT GT GAT ACGCT CCT CT GCAGAAAA TATGGACTTGAAGGAGCAGCCAG 892 

Qy 857 GTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCT 916 

I II I II I I I I I I I I I I I II I I II II I I I I I I I I I I I I i I I I I I I I I I I I I I II I I I I 
Db 893 GTAACACTATTTCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTT 952 

Qy 917 CTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTA 976 

I I I II ! I I 1 I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I II I I I I I I 
Db 953 CTCTTCCTTCTCTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTA 1012 

Qy 977 ACTTAT CAGCAGT GT CAT CCTCAGAAGGAACAATT GAAGAAACTTTAAAT GAAGCTTCT A 1036 

I I I III MM I I II I I I II I I I I I II II M II I I I II I I I I I II I I 

Db 1013 AT TT GT CAACAGT AT T AC C C ACT GAAG GAAC AC T T CAAGAAAAT GT C AGT GAAGCT T CT A 1072 

Qy 1037 AAG AG T T G C C AG AG AG G G C AAC AAAT C CAT T T G T AAAT AG AG AT T TAG C AGAAT T T T C AG 1096 

II I M I I I I M I M II I II II I I I I M M II I II I I I M I I II II I I 

Db 1073 AAGAGGTCTCAGAGAAGGCAAAAACTCTACTCATAGATAGAGATTTAACAGAGTTTTCAG 1132 

Qy 1097 AATT AGAAT AT T C AGAAAT GGGAT CAT C T T T T AAAG GC T C C C CAAAAGGAGAGT C AGC C A 1156 

I I I I It I II I I II II I II I I I II I II I II 1 I Ml I I I II M III II III 

Db 1133 AATT AGAAT ACT C AGAAAT GGGAT C AT CGT T C AGT GT C T CT C CAAAAGC AGAAT CT GC C G 1192 

Qy 1157 TATTAGT AGAAAACACTAAGGAAGAAGTAATT GT GAGGAGTAAA GACAAAGAGGATT 1213 

II I II I II I I I II I I I I I I II I I I I II I I Mill II I I II I I I 

Db 1193 T AAT AGT AG C AAAT C C T AGG GAAGAAAT AAT CGT GAAAAAT AAAGAT GAAGAAGAGAAGT 1252 

Qy 1214 T AGT T T GT AGT GC AG C C CT T CAC AGT C C AC AAGAAT C AC CT GTGG 1258 

I I I I I M I I II I I II I I I I II I I I I I I I I III 

Db 1253 T AGT T AGT AATAACAT C CT T CAT AAT CAAC AAGAGT T AC CTAC AGCT CT T AC TAAAT T G G 1312 

Qy 1259 GT AAAGAAGAC AGAGT T GT GT CT C C AGAAAAGACAAT GGAC AT T T T TAAT GAAAT G C AG A 1318 

I I I II I II I I I I M I I I I I I I M M II I MM I I I II II M I I I 

Db 1313 T T AAAGAG GAT GAAGT T GT GT C T T C AGAAAAAGCAAAAGACAGT T T TAAT GAAAAGAGAG 1372 

Qy 1319 T GT C AGT AGT AGC AC CT GT GAG G GAAGAGT AT GCAGAC T TT AAG C CAT T T GAACAAGC AT 1378 

I I II I I III III II I I II I II I I M II I I II I II I II I II II I I I II 

Db 1373 T T GCAGT GGAAG CT C C TAT GAGGGAG GAAT AT G CAGAC T T CAAAC CAT T T GAGC GAGT AT 14 32 

Qy 137 9 GGGAAGT GAAAGAT ACTTAT GAGG GAAGT AGG GAT GT GCT GGCT GCT AGAGCT 1431 

II II II II I I II I II I II I I I II I I I I I II I I II I I I II I 

Db 1433 GGGAAGT GAAAGAT A GTAAGGAAGAT AGT GATATGTT GGCT GCT GGAGGTAAAAT C G 14 89 

Qy 1432 AAT GT G GAAAGTAAAGT GGAC AGAAAAT GCT T G GAAGAT AGC C T GGAG CAAAAAA 1486 

II I II II II II I II I I II I I I II II II I I M I I I I I I II I I I I I I 
Db 14 90 AGAGCAACT T GGAAAGTAAAGT G GAT AAAAAAT GT T TT GC AGAT AGC C T T GAGCAAACT A 154 9 



Qy 1487 GTCTTGGGAAGGATAGTGAAGGCAGAAATGAGGATGCTTCTTTCCCCAGTACCCCAGAAC 1546 



II I II 1 1 II 1 1 1 1 I II 1 1 1 II III 1 1 1 1 1 1 1 1 1 1 1 1 1 II I 1 1 II 1 1 

Db 1550 ATCACGAAAAAGATAGTGAGAGTAGTAATGATGATACTTCTTTCCCCAGTACGCCAGAAG 1609 

Qy 1547 CT GT GAAG GAC AGC T C C AGAGC AT AT AT T AC CT GT GCT T C CT T T A CCTCAGCAACCG 1603 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1610 GT AT AAAG GAT C GT T CAG GAGC AT AT AT C ACAT GTGCTCCCTT T AAC C C AG C AGCAACT G 1669 

Qy 1604 AAAG CAC CAC AG C AAAC ACT T TCCCTTTGT T AGAAGAT CAT AC T T C AGAAAAT AAAAC AG 1663 

I I I I I II I I I I I I III I I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I 

Db 1670 AGAGC AT T G CAACAAACAT T TT T C CTT T GT T AG GAGAT C CT AC T T CAGAAAAT AAGAC C G 1729 

Qy 1664 AT G - AAAAAAAAAT AGAAGAAAG GAAG G C C C AAAT T AT AAC AG AGAAG ACTAGCCCC 1719 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II 
Db 1730 AT GAAAAAAAAAAT AGAAGAAAAGAAG GC C CAAAT AGT AAC AGAGAAGAAT AC T AGC AC C 17 8 9 

Qy 1720 AAAAC GT CAAAT C C - TT T C C TT GT AGC AGT ACAG GAT T C T GAGGC AGAT TAT GT T ACAAC 1778 

Mill I I I I I II III I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I Mill 
Db 1790 AAAAC AT C AAAC C C T T T T AC T T GT AGC AGC AC AG GAT T C T GAG AC AGAT TAT GT CAC AAC 1849 

Qy 1779 AGAT AC CT TAT CAAAGGT G ACT GAG GC AGC AGT GT CAAAC AT GC CT GAAGGT CT GAC GC C 1838 

Mill III I I II II I II II I M I II III II I M II I I I M I I II Mill II 
Db 1850 AGATAATTTAACAAAGGT GACT GAGGAAGT CGT GGCAAACAT GCCT GAAGGCCT GACTC C 1909 

Qy 1839 AGAT T TAGT T CAGGAAGC AT GT GAAAGT GAACT GAAT GAAG C C AC AGGT ACAAAGAT T GC 189 8 

II II I I I I I I I II I M II I I I II I I I I II I II I I I II I I I I I I II II I I II I I I I 

Db 1910 AGAT T TAGT ACAGGAAGC AT GT GAAAGT GAAT T GAAT GAAGT TACT GGT ACAAAGAT T GC 1969 

Qy 18 99 TT AT GAAACAAAAGT GGACTTGGT C CAAACAT CAGAAGCT ATACAAGAAT CACTTTACCC 1958 

I I II I I I I I I I I I II I II I I I II I I II I I I I I I I I I Ml I II I I Mill II II 
Db 1970 TTATGAAACAAAAATGGACTTGGTT CAAACATCAGAAGTTATGCAAGAGTCACT CTATCC 2029 

Qy 1959 CAC AGCAC AGCT T T GC C CAT CAT T T GAG GAAG C T GAAG C AACT C C GT CAC CAGT T T T G C C 2018 

I I I I I I II I I II I II II M I II I I I II I I I I I I I I II I I II II I I M II I M 
Db 2030 T GC AGCAC AGCT T T GCC CAT CAT T T GAAGAGT CAGAAGCT AC T C C T T CAC C AGTT T T G C C 2089 

Qy 2019 TGATATTGTTATGGAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGT 2 078 

III I II I I I I I II I II II I II II I I II I I I I II M I II I I II II I I II I 

Db 2090 TGACATTGTTATGGAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGAT 2149 

Qy 2079 GC AGC C CAGT GT AT C CC CAC T GGAAGC AC C T C CT C CAGT TAGT TAT GAC AGT AT AAAG CT 213 8 

I I I I I I I I Ml III I II I I II II I I II I I I I II I I II I I II I I 
Db 2150 AC AGC C C AGCT CAT CAC CAT T AGAAG CTT CT T CAGTTAAT TAT G AAAG CAT AAAAC A 2206 

Qy 2139 T GAGC CT GAAAAC C C CC C AC CAT AT GAAGAAGC C AT GAAT GT AGCAC T AAAAGCTTT 2195 

I I II I II M I II I I I I II M II I I I II II I I II I I I I MM MM I II II I 

Db 2207 T GAGC CT GAAAACCC CC CACCATAT GAAGAGGCCATGAGT GT AT CACTAAAAAAAGT AT C 2266 

Qy 2196 GGGAAC AAAGGAAGGAATAAAAGAGCCT GAAAGT T T TAAT GC AGCT GT T CAG GAAAC AGA 2255 

I II I I I M I II I III I I I II I I I I M I I I I I I I I M II I I I I II II I I I M I 
Db 22 67 AGGAATAAAGGAAGAAATTAAAGAGCCT GAAAATATTAAT GCAGCT CTT CAAGAAACAGA 2 32 6 

Qy 2256 AG CT C CT T AT AT AT C CAT T G CGT GT GAT T TAAT T AAAGAAACAAAGC T CT C C ACT GAG C C 2315 

M II I II I I I I I II I I I I I I I I I M II I II II II I M li I II M I I II I I M M 
Db 2327 AG CT C CT TAT AT AT CT AT T GCAT GT GAT T TAAT T AAAGAAACAAAGC T T T CT G CT GAAC C 238 6 



Qy 2316 AAGTCCAGATTTCTCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACA 2375 
I I II II I I I II I I I I I I I I I I I I I II I II I II I II I I I I I I I I M I I 



Db 2387 AGCTCCGGATTTCTCTGATTATTCAGAAATGGCAAAAGTTGAACAGCCAGTGCCTGATCA 2446 

Qy 2 376 C GCT GAGCT AGT GGAGGAT T C C T CAC CT GAAT C T GAAC C AGT T GACT T AT T T AGT GAT G A 24 35 

I I I I I I I 1 I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I I I 
Db 2447 T T CT GAG CT AGT T GAAGAT T CCT CAC CT GAT T CT GAAC C AGT T GACT TAT T TAGT GAT GA 2506 

Qy 2436 T T C GAT T C CT GAAGT C C C ACAAAC ACAAGAGGAG GCT GT GAT GCT CAT GAAGGAGAGT C T 2495 

II I II I I I I I II I I I II I I I I I I I I II I I I I I I I I I I I I I I II I I I I I 
Db 2 507 T T CAAT AC C T GACGT T C CAC AAAAACAAGAT GAAACT GT GAT GC T T GT GAAAGAAAGT CT 2566 

Qy 2 4 96 CACTGA AGT GT CT GAGACAGT AGC C C AGCACAAAGAGGAGAGACT T AGT GC CT C 2549 

I I I I I I II I III I III I I I I I I I I I I I I I I 

Db 2567 CACTGAGACTTCATTTGAGTCAATGATAGAATATGAAAATAAGGAAAAACTCAGTGCTTT 2626 

Qy 2550 ACCTCAGGAGCTAGGAAAGCCATATTTAGAGT CTTTT CAGCCCAATTTACATAGTACAAA 2609 

II I III I II I I I I I I I I I I I I II I I I I I I III II I I I I III I I II I 
Db 2627 G C C AC CT GAGG GAG GAAAGC C AT AT TT GGAAT CT T T T AAG C T CAGT T T AGAT AAC AC AAA 268 6 

Qy 2610 AGATGC T GCAT CTAAT GACAT T CCAACATT GACCAAAAAGGAGAAAATTT CTTTGCA 2666 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2 68 7 AGAT AC C CT GT T AC C T GAT GAAGT T T CAAC AT T GAGC AAAAAGGAGAAAAT T C CT TT GC A 274 6 

Qy 2 667 AAT G GAAGAGT T T AAT AC T GCAAT T TAT T CAAAT GAT GACT TACT T T CT T CT AAGGAAGA 272 6 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I 

Db 2747 GAT GGAGGAG CT CAGT AC T GC AGT T TAT T CAAAT GAT GACT T ATT TAT T T C T AAGGAAG C 2806 

Qy 2727 C AAAAT AAAAGAAAGT GAAAC AT T T T C AGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT T 2786 

I I I I I I I I I I MINI I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II 
Db 2 8 07 ACAGATAAGAGAAACTGAAACGTTTTCAGATT CATCT CCAATTGAAATTATAGAT GAGTT 2 8 66 

Qy 27 87 T C C CAC GT T T GT CAGT G CT AAAGAT GAT T C T C CTAAATT AGC CAAGGAGTACACT GA 2 8 43 

II II II Mill I I I I I I I I I I I I I I II I I I I I I I I III II I I I I I 

Db 28 67 C C CT ACAT T GAT C AGT T CT AAAAC T GAT T CAT T T T CT AAAT T AGC C AGG GAAT AT ACT G A 2926 

Qy 2844 T CT AGAAGT AT C CGACAAAAGT GAAAT T GCTAAT AT C CAAAGC GGGG C AGATT C AT T G C C 2903 

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II II II I I I I I I I I I 

Db 2 927 C CT AGAAGT AT C CCACAAAAGT GAAAT T GCTAAT G C C C C GGAT GGAG CT GGGT CAT T GC C 2986 

Qy 2904 TT GCT TAGAATTGCCCTGTGACCTTTCTTTCAAGAAT AT ATAT CCT AAAGAT GAAG 2959 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I III I II I I I I I I I I I 
Db 2 987 T T GC AC AGAAT T GC C C CAT GAC CT T T CT T T GAAGAAC AT ACAAC C CAAAGT T GAAGAGAA 304 6 

Qy 2960 — T ACAT GT T T C AGAT GAAT T CT C C GAAAAT AGGT C CAGT GT AT C T AAG G CAT C CAT AT C 3017 

I I I I I I I I I I I I I I I I I I I I II It I I I I I I I I MM 
Db 3047 AATCAGTTTCTCAGATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATT 3106 

Qy 3018 G C C T T CAAAT GT CTCTGCTTT GGAACC T C AGACAGAAAT GG GCAG CAT AGT TAAAT CC AA 3077 

I I II I I I II I I I I I I I II I I I II I I I II II I II I I I I II II I I I M I 
Db 3107 G C C T C C AGAT GTTTCTGCTTTGGC CAC T CAAG CAGAGAT AGAGAG CAT AGT TAAAC CCAA 3166 

Qy 3078 AT C ACT T AC GAAAGAAGC AGAGAAAAAACT T C CT T CT GAC AC AG AG AAAG AG GAC AGAT C 3137 

I I II I I I I I II I I II I I I I I II I I I I I I I I II I I I I I I I I I I II I I I I M I 
Db 3167 AGT T CT T GT GAAAGAAG C T GAGAAAAAACT T C CT T C C GAT AC AGAAAAAGAGGAC AGAT C 3226 

Qy 3138 CCTGTCAGCTGTATTGTCAGCAGAGCTGAG-TAAAACTTCAGTTGTTGACCTCCTCTACT 3196 

I II III II I I II I II II I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I M I 
Db 3227 AC CAT CT GCT AT AT T T T CAG C AGAG CT GAGCT AAAAC T T CAGT T GT T GAC C T C CT GTAC T 32 8 6 



Qy 3197 3256 

Db 3287 3346 


Qy 3257 TGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGA 3316 

I I I I I I I I I I I I I I II II II I I I I I I II I I I I I I I I II I II I I I I M I M I | | | I 
Db 3347 TGACAGTATTCAGCATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGA 3406 

Qy 3317 CT AT C AG CT T T AGGAT AT AT AAG GGC GT GAT C C AGGC T AT C C AGAAAT C AGAT GAAGGC C 3376 

I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II 
Db 34 07 CC AT CAGC T T TAG GAT AT ACAAGGGT GT GAT C CAAGC TAT C C AGAAAT C AGAT GAAGGC C 3466 

Qy 3377 ACCCATT CAGGGCAT ATTTAGAAT CT GAAGTT GCTAT AT CAGAGGAATT GGTTCAGAAAT 3436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I 
Db 3467 AC C CAT T C AG GGC AT AT C T G GAAT CT GAAGT T G C TAT AT CT GAG GAGTT GGT T CAGAAGT 3526 

Qy 3437 AC AGTAATT CTGCTCTTGGT CAT GT GAACAGC ACAAT AAAAGAACT G AGG C GGCTTTTCT 3496 

I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 3527 ACAGTAATTCTGCTCTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCT 3586 

Qy 3497 T AGT T GAT GATT T AGT T GAT T C C CT GAAGT T T GCAGT GTT GAT GT G GGT GT T T ACTT AT G 3556 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I 
Db 3587 T AGT T GAT GAT T TAGT T GAT T CT CT GAAGT T T GCAGT GT T GAT GT GG GT AT T T AC CT AT G 3646 

Qy 3557 TTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTA 3616 

I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I II I I I I I II I I I I II I I I I II 
Db 3647 TTGGTGCCTTGTTTAATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTG 3706 

Qy 3617 T T C C T GT TAT T TAT GAACGG C AT C AGGT GC AGAT AGAT CAT TAT CT AGGAC T T GC AAAC A 367 6 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 37 07 T T C CT GT TAT T TAT GAACGGCAT CAGGCAC AGAT AGAT CAT TAT CT AG GACT T GCAAAT A 3766 

Qy 3677 AGAGT GTT AAG GAT GC CAT GGC CAAAAT C CAAGCAAAAAT C C CT GGAT T GAAGC GCAAAG 3736 

III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II M I I I 
Db 3767 AGAAT GT T AAAG AT GCTAT GGC T AAAAT C CAAGCAAAAAT C C C T G GATT GAAGC GCAAAG 3826 

Qy 3737 CAGA 3740 

III 

Db 3827 CTGA 3830 



RESULT 2 
US-08-700-607-2 

Sequence 2, Application US/08700607 
Patent No. 5858708 
GENERAL INFORMATION: 

APPLICANT: Bandman, Olga 
APPLICANT: Au-Young, Janice 
APPLICANT: Goli, Surya K. 
APPLICANT: Hillman, Jennifer L. 

TITLE OF INVENTION: TWO NOVEL HUMAN NSP-LIKE PROTEINS 
NUMBER OF SEQUENCES: 9 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Incyte Pharmaceuticals, Inc. 
STREET: 3174 Porter Drive 
CITY: Palo Alto 



STATE: CA 
COUNTRY: U.S. 
ZIP: 94304 
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 1.5 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/700,607 
FILING DATE: Filed Herewith 
ATTORNEY/AGENT INFORMATION: 
NAME: Billings, Lucy J. 
REGISTRATION NUMBER: 36,749 
REFERENCE/DOCKET NUMBER: PF-0114 US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 415-855-0555 
TELEFAX: 415-845-4166 
INFORMATION FOR SEQ ID NO: 2: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 799 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
IMMEDIATE SOURCE: 
LIBRARY: 

CLONE: Consensus 
US-08-700-607-2 

Query Match 13.3%; Score 497.4; DB 2; Length 799; 

Best Local Similarity 92.7%; Pred. No. 1.4e-106; 

Matches 522; Conservative 0; Mismatches 41; Indels 0; Gaps 0; 

Qy 3178 GT T GT T GAC CT C CT CT ACT GGAG AGAC AT T AAGAAGAC T G GAGT G GT GT T T GGT GC C AG C 3237 

I I I I I II I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 108 GTTGTTGACCTCCTGTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGC 167 

Qy 3238 TTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCC 3297 

I I M I I I I I I I I I II I I I I I I I I I II I I II I I I II I II I I I I I I I I I I I I I I 
Db 168 CT AT TCCTGCTGCTTT CAT T GAC AGT AT T C AGC AT T GT GAG C GT AAC AG C C T AC AT T GC C 227 

Qy 32 98 TTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGCTATC 3357 

I M I M II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I MINI 
Db 228 TTGGCCCTGCTCTCTGTGACCATCAGCTTTAGGATATACAAGGGTGTGATCCAAGCTATC 287 

Qy 3358 C AGAAAT C AGAT GAAGG C C AC C CAT T C AG GGC AT AT T T AGAAT CT GAAGT T G C TAT AT C A 3417 

I I I I I I I M I I I I I I I M I I I I I I I I II I I I I I I I I I I I I I I I I M I II I I I I I II I 
Db 288 CAGAAAT C AGAT GAAGGCCAC CCATT CAGGGCAT AT CT GGAAT CT GAAGTT GCTATAT CT 34 7 

Qy 3418 GAGGAAT T GGT T CAGAAAT ACAGTAAT TCTGCTCTT G GT C AT GT GAAC AGC ACAAT AAAA 3477 

I I I I I I I I I I I I I I I I I I I I I || I I I I I I || I M I I I I I I I I I I I I I II I I II I I 
Db 34 8 GAG GAGT T GGT T CAGAAGT ACAGTAAT TCTGCTCT T GGT CAT GT GAAC T GCAC GAT AAAG 4 07 



Qy 3478 GAACTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTG 3537 

I I I I I Mill II I I I I I II II I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 408 GAACTCAGGCGCCTCTTCTTAGTTGATGATTTAGTTGATTCTCTGAAGTTTGCAGTGTTG 467 

Qy 3538 ATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCT 3597 

I I II I I I I I I I I I I I I I II II I I I I I I I I I I I I II I I I I I I I I | M II i I I I I Ml 
Db 468 ATGTGGGTATTTACCTATGTTGGTGCCTTGTTTAATGGTCTGACACTACTGATTTTGGCT 527 

Qy 3598 CTGATCTCACTCTTCAGTATTCCTGTTATTTATGAACGGCATCAGGTGCAGATAGATCAT 3657 

I I M I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I | | I | | | | I I I I I I I I | | M 
Db 528 CTCATTTCACTCTTCAGTGTTCCTGTTATTTATGAACGGCATCAGGCACAGATAGATCAT 587 

Qy 3658 TATCTAGGACTTGCAAACAAGAGTGTTAAGGATGCCATGGCCAAAATCCAAGCAAAAATC 3717 

I I I I I M I I II I I II I I I I II I I I I I I I II I I II I I I I I I I I I I I I I I I II I I I I 
Db 588 TATCTAGGACTTGCAAATAAGAATGTTAAAGATGCTATGGCTAAAATCCAAGCAAAAATC 647 

Qy 3718 CCTGGATTGAAGCGCAAAGCAGA 3740 

I I I I I I II I I I I I I I II I II II 
Db 648 CCTGGATTGAAGCGCAAAGCTGA 670 



RESULT 3 

US-09-023-655-382 

; Sequence 382, Application US/09023655 

; Patent No. 6607879 

; GENERAL INFORMATION: 

; APPLICANT: Cocks, Benjamin G. 

APPLICANT: Susan G. Stuart 

APPLICANT: Jeffrey J. Seilhamer 

TITLE OF INVENTION: COMPOSITION FOR THE DETECTION OF BLOOD CELL GENE 
; TITLE OF INVENTION: EXPRESSION 

; NUMBER OF SEQUENCES: 1508 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: INCYTE PHARMACEUTICALS, INC. 

STREET: 3174 PORTER DRIVE 

CITY: PALO ALTO 

STATE: CALIFORNIA 

COUNTRY: USA 

ZIP: 94304 
; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: Word Perfect 6.1 for Windows/MS-DOS 6.2 
; CURRENT APPLICATION DATA: 

; APPLICATION NUMBER: US/09/023,655 

; FILING DATE: HEREWITH 

CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 

FILING DATE: 

CLASSIFICATION: 
ATTORNEY/AGENT INFORMATION: 
; NAME: Zeller, Karen J. 

REGISTRATION NUMBER: 37,071 

REFERENCE/DOCKET NUMBER: PA-0001 US 
; TELECOMMUNICATION INFORMATION: 

TELEPHONE: (650) 855-0555 

TELEFAX: (650) 845-4166 
; INFORMATION FOR SEQ ID NO: 382: 



SEQUENCE CHARACTERISTICS: 
; LENGTH: 2610 base pairs 

TYPE: nucleic acid 
STRANDEDNESS: single 
TOPOLOGY: linear 
IMMEDIATE SOURCE: 
; LIBRARY: LUNGNOT14 

CLONE: 1508778 
US-09-023-655-382 

Query Match 13.0%; Score 484.8; DB 4; Length 2610; 

Best Local Similarity 92.4%; Pred. No. 2.4e-103; 

Matches 521; Conservative 0; Mismatches 42; Indels 1; Gaps 1; 

Qy 3178 GTTGTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGC 3237 

M I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I! I I 
Db 1311 GTTGTTGACCTCCTGTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGC 1370 

Qy 3238 TTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTAC-ATTGC 3296 

I I I I M I I I I I I I II II I I I I I I I I M I || I | | || MMI I I I I I I I I -I I I 
Db 1371 CTATTCCTGCTGCTTTCATTGACAGTATTCAGCATTGTGAGCGTAACAGCCTACAATTGC 1430 

Qy 3297 CTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGCTAT 3356 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 
Db 1431 CTTGGCCCTGCTCTCTGT GAC CAT C AGC T T TAG GAT AT ACAAG G GT GT GAT C CAAG CT AT 1490 

Qy 3357 C C AGAAAT C AGAT GAAGG C C AC C C ATT C AGGGC AT AT T T AGAAT CT GAAGT T GCT AT AT C 3416 

I I I I M I I I I I I I I I I I I I I I I I I I II I I I I I M I II I I I I I I I I I I I I I I I I I I I M 
Db 14 91 C C AGAAAT C AGAT GAAG G C C AC C CAT T C AGG GC AT AT C T GGAAT C T GAAGT T G CT AT AT C 1550 

Qy 3417 AGAG GAAT T G GT T C AGAAAT AC AGTAAT TCTGCTCTT G GT C AT GT GAAC AGC ACAAT AAA 34 76 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 1551 T GAGGAGT T GGT T CAGAAGT AC AGTAAT TCTGCTCTT GGT CAT GT GAAC T GCAC GAT AAA 1610 

Qy 3477 AGAACTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTT 3536 

I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I || | | | | | | | | | | | || | | | | || 
Db 1611 GGAACTCAGGCGCCTCTTCTTAGTTGATGATTTAGTTGATTCTCTGAAGTTTGCAGTGTT 1670 

Qy 3537 GATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGC 3596 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I II 
Db 1671 GATGTGGGTATTTACCTATGTTGGTGCCTTGTTTAATGGTCTGACACTACTGATTTTGGC 1730 

Qy 3597 T CT GAT CT CAC T CT T C AGT ATT C CT GT TAT T TAT GAAC GGCAT C AGGT GC AGAT AGAT C A 3656 

Ml II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I 
Db 1731 T CT CAT T T CACT CT T C AGT GT T C CT GT T AT T TAT GAAC G G CAT C AGGC ACAGATAGAT C A 17 90 

Qy 3657 T TAT C TAG GAC TTGCAAACAAGAGTGTTAAG GAT GC CAT GGCCAAAATC CAAG CAAAAAT 3716 

I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 17 91 TT AT CT AGGACTT GCAAATAAGAAT GTTAAAGAT GCTAT GGCTAAAAT C CAAGCAAAAAT 1850 

Qy 3717 C C C T G GAT T GAAG C G C AAAG C AG A 3740 

I I I I I I I I I I I I I I II I I I I II 
Db 1851 CCCTGGGTTGAAGCGCAAAGCTGA 1874 

RESULT 4 

US-09-149-476-254 



Sequence 254, Application US/09149476 
Patent No. 6420526 
GENERAL INFORMATION: 
APPLICANT: Rosen et al. 

TITLE OF INVENTION: 186 Human Secreted proteins 
FILE REFERENCE: PZ002P1 

CURRENT APPLICATION NUMBER: US/09/149,476 

CURRENT FILING DATE: 1998-09-08 

EARLIER APPLICATION NUMBER: PCT/US98/04493 

EARLIER FILING DATE: 1998-03-06 

EARLIER APPLICATION NUMBER: 60/040,162 

EARLIER FILING DATE: 1997-03-07 

EARLIER APPLICATION NUMBER: 60/040,333 

EARLIER FILING DATE: 1997-03-07 

EARLIER APPLICATION NUMBER: 60/038,621 

EARLIER FILING DATE: 1997-03-07 

EARLIER APPLICATION NUMBER: 60/040,626 

EARLIER FILING DATE: 1997-03-07 

EARLIER APPLICATION NUMBER: 60/040,334 

EARLIER FILING DATE: 1997-03-07 

EARLIER APPLICATION NUMBER: 60/040,336 

EARLIER FILING DATE: 1997-03-07 

EARLIER APPLICATION NUMBER: 60/040,163 

EARLIER FILING DATE: 1997-03-07 

EARLIER APPLICATION NUMBER: 60/047,600 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,615 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,597 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,502 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,633 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,583 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,617 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,618 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,503 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,592 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,581 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,584 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,500 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,587 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,492 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,598 

EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,613 



EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,582 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,596 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,612 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,632 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,601 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,580 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,568 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,314 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,569 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,311 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,671 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,674 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,669 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,312 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,313 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,672 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,315 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/048,974 
EARLIER FILING DATE: 1997-06-06 
EARLIER APPLICATION NUMBER: 60/056,886 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,877 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,889 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,893 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,630 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,878 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,662 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,872 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,882 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,637 
EARLIER FILING DATE: 1997-08-22 



EARLIER APPLICATION NUMBER: 60/056,903 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,888 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,879 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,880 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,894 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,911 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,636 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,874 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,910 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,864 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,631 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,845 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,892 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/057,761 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/047,595 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,599 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,588 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,585 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,586 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,590 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,594 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,589 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,593 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,614 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,578 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,576 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/047,501 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,670 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/056,632 



EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,664 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,876 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,881 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,909 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,875 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,862 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,887 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,908 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/048,964 
EARLIER FILING DATE: 1997-06-06 
EARLIER APPLICATION NUMBER: 60/057,650 
EARLIER FILING DATE: 1997-09-05 
EARLIER APPLICATION NUMBER: 60/056,884 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/057,669 
EARLIER FILING DATE: 1997-09-05 
EARLIER APPLICATION NUMBER: 60/049,610 
EARLIER FILING DATE: 1997-06-13 
EARLIER APPLICATION NUMBER: 60/061,060 
EARLIER FILING DATE: 1997-10-02 



Query Match 6.1%; Score 228.8; DB 4; Length 1766; 

Best Local Similarity 63.4%; Pred. No. 1.3e-43; 

Matches 350; Conservative 0; Mismatches 202; Indels 0; Gaps 0; 



Qy 3174 TTCAGTTGTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGC 3233 

IIM MM 1 1 M M 1 M M 1 M 1 M M M 1 1 1 M M M 1 1 
Db 286 TGCGGTGCACGATCTGATTTTCTGGAGAGATGTGAAGAAGACTGGGTTTGTCTTTGGCAC 345 

Qy 3234 CAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACAT 3293 

M 1 M M 1 M M M M 1 M 1 M M 1 1 1 M M M 1 M 1 1 
Db 346 CACGCTGATCATGCTGCTTTCCCTGGCAGCTTTCAGTGTCATCAGTGTGGTTTCTTACCT 405 

Qy 3294 TGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGC 3353 

1 MM 1 1 M M 1 M M M M M M 1 M M M 1 1 1 ! 1 1 Mill M 
Db 406 CATCCTGGCTCTTCTCTCTGTCACCATCAGCTTCAGGATCTACAAGTCCGTCATCCAAGC 465 

Qy 3354 TATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCATATTTAGAATCTGAAGTTGCTAT 3413 

1 1 M M 1 M M 1 1 M M M 1 1 M M M MM Ml M M M 1 
Db 466 TGTACAGAAGTCAGAAGAAGGCCATCCATTCAAAGCCTACCTGGACGTAGACATTACTCT 525 

Qy 3414 ATCAGAGGAATTGGTTCAGAAATACAGTAATTCTGCTCTTGGTCATGTGAACAGCACAAT 3473 

II Ml 1 M M M 1 1 1 1 || M 1 1 II 1 1 1 1 
Db 526 GTCCTCAGAAGCTTTCCATAATTACATGAATGCTGCCATGGTGCACATCAACAGGGCCCT 585 

Qy 3474 AAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGT 3533 

HI M II 1 1 M 1 II II 1 M 1 M 1 II II 1 1 1 1 II I I 1 1 1 
Db 586 GAAACTCATTATTCGTCTCTTTCTGGTAGAAGATCTGGTTGACTCCTTGAAGCTGGCTGT 645 


Qy 3534 GTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTT 3593
         || |||||| || | || |||||||||||  | || || ||  | || || || ||| |
Db  646 CTTCATGTGGCTGATGACCTATGTTGGTGCTGTTTTTAACGGAATCACCCTTCTAATTCT 705

Qy 3594 AGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTATGAACGGCATCAGGTGCAGATAGA 3653
         |||    |     | |||||| | ||  || | |||||   | |  ||   ||||| ||
Db  706 TGCTGAACTGCTCATTTTCAGTGTCCCGATTGTCTATGAGAAGTACAAGACCCAGATTGA 765

Qy 3654 TCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGATGCCATGGCCAAAATCCAAGCAAA 3713
        ||| ||| | ||  | ||     |       |||       | |  || |||||||||||
Db  766 TCACTATGTTGGCATCGCCCGAGATCAGACCAAGTCAATTGTTGAAAAGATCCAAGCAAA 825

Qy 3714 AATCCCTGGATT 3725
        | |||||||| |
Db  826 ACTCCCTGGAAT 837



RESULT 5 

US-09-149-476-255 

; Sequence 255, Application US/09149476 

; Patent No. 6420526 

; GENERAL INFORMATION: 

; APPLICANT: Rosen et al.

; TITLE OF INVENTION: 186 Human Secreted proteins 
; FILE REFERENCE: PZ002P1 

CURRENT APPLICATION NUMBER: US/09/149,476 

CURRENT FILING DATE: 1998-09-08 

EARLIER APPLICATION NUMBER: PCT/US98/04493 
; EARLIER FILING DATE: 1998-03-06 
; EARLIER APPLICATION NUMBER: 60/040,162 

EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,333 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/038,621 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,626 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,334 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,336 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,163 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/047,600 
; EARLIER FILING DATE: 1997-05-23 
; EARLIER APPLICATION NUMBER: 60/047,615 
; EARLIER FILING DATE: 1997-05-23 

EARLIER APPLICATION NUMBER: 60/047,597 
; EARLIER FILING DATE: 1997-05-23 
; EARLIER APPLICATION NUMBER: 60/047,502 
; EARLIER FILING DATE: 1997-05-23 
; EARLIER APPLICATION NUMBER: 60/047,633 
; EARLIER FILING DATE: 1997-05-23 
; EARLIER APPLICATION NUMBER: 60/047,583 
; EARLIER FILING DATE: 1997-05-23 
; EARLIER APPLICATION NUMBER: 60/047,617 



EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,618 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,503 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,592 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,581 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,584 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,500 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,587 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,492 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,598 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,613 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,582 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,596 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,612 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,632 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,601 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,580 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,568 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,314 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,569 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,311 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,671 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,674 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,669 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,312 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,313 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,672 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,315 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/048,974 
EARLIER FILING DATE: 1997-06-06 



EARLIER APPLICATION NUMBER: 60/056,886 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,877 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,889 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,893 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,630 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,878 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,662 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,872 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,882 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,637 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,903 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,888 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,879 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,880 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,894 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,911 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,636 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,874 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,910 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,864 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,631 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,845 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,892 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/057,761 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/047,595 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,599 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,588 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,585 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,586 



; EARLIER FILING DATE: 1997-05-23 

; EARLIER APPLICATION NUMBER: 60/047,590 

; EARLIER FILING DATE: 1997-05-23 

; EARLIER APPLICATION NUMBER: 60/047,594

; EARLIER FILING DATE: 1997-05-23 

; EARLIER APPLICATION NUMBER: 60/047,589 

; EARLIER FILING DATE: 1997-05-23 

; EARLIER APPLICATION NUMBER: 60/047,593 

; EARLIER FILING DATE: 1997-05-23 

; EARLIER APPLICATION NUMBER: 60/047,614 

; EARLIER FILING DATE: 1997-05-23 

; EARLIER APPLICATION NUMBER: 60/043,578 

; EARLIER FILING DATE: 1997-04-11 

; EARLIER APPLICATION NUMBER: 60/043,576 

; EARLIER FILING DATE: 1997-04-11 

; EARLIER APPLICATION NUMBER: 60/047,501 

; EARLIER FILING DATE: 1997-05-23 

; EARLIER APPLICATION NUMBER: 60/043,670 

; EARLIER FILING DATE: 1997-04-11 

; EARLIER APPLICATION NUMBER: 60/056,632 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/056,664 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/056,876 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/056,881 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/056,909 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/056,875 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/056,862 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/056,887 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/056,908 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/048,964 

; EARLIER FILING DATE: 1997-06-06 

; EARLIER APPLICATION NUMBER: 60/057,650 

; EARLIER FILING DATE: 1997-09-05 

; EARLIER APPLICATION NUMBER: 60/056,884 

; EARLIER FILING DATE: 1997-08-22 

; EARLIER APPLICATION NUMBER: 60/057,669 

; EARLIER FILING DATE: 1997-09-05 

; EARLIER APPLICATION NUMBER: 60/049,610 

; EARLIER FILING DATE: 1997-06-13 

; EARLIER APPLICATION NUMBER: 60/061,060 

; EARLIER FILING DATE: 1997-10-02 

Query Match 6.1%; Score 228.8; DB 4; Length 2664; 

Best Local Similarity 63.4%; Pred. No. 1.6e-43; 

Matches 350; Conservative 0; Mismatches 202; Indels 0; Gaps 0; 

Qy 3174 TTCAGTTGTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGC 3233
        | | ||    || ||  | | |||||||||  | |||||||||||  | || |||||  |
Db  261 TGCGGTGCACGATCTGATTTTCTGGAGAGATGTGAAGAAGACTGGGTTTGTCTTTGGCAC 320

Qy 3234 CAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACAT 3293
        ||   |  || ||||||| || ||| |||  |||||  |  |||||||     | ||| |
Db  321 CACGCTGATCATGCTGCTTTCCCTGGCAGCTTTCAGTGTCATCAGTGTGGTTTCTTACCT 380

Qy 3294 TGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGC 3353
           | |||| || ||||| || || |||||||| ||||| || |||  ||| ||||| ||
Db  381 CATCCTGGCTCTTCTCTCTGTCACCATCAGCTTCAGGATCTACAAGTCCGTCATCCAAGC 440

Qy 3354 TATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCATATTTAGAATCTGAAGTTGCTAT 3413
        | | ||||| ||||| |||||||| |||||||  || ||  | ||    ||  || || |
Db  441 TGTACAGAAGTCAGAAGAAGGCCATCCATTCAAAGCCTACCTGGACGTAGACATTACTCT 500

Qy 3414 ATCAGAGGAATTGGTTCAGAAATACAGTAATTCTGCTCTTGGTCATGTGAACAGCACAAT 3473
         ||    |||    | || || ||||  ||| ||||  | |  ||  | |||||  |  |
Db  501 GTCCTCAGAAGCTTTCCATAATTACATGAATGCTGCCATGGTGCACATCAACAGGGCCCT 560

Qy 3474 AAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGT 3533
         |||    | |  || || ||  | || || ||| | ||||| ||| ||||| | || ||
Db  561 GAAACTCATTATTCGTCTCTTTCTGGTAGAAGATCTGGTTGACTCCTTGAAGCTGGCTGT 620

Qy 3534 GTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTT 3593
         || |||||| || | || |||||||||||  | || || ||  | || || || ||| |
Db  621 CTTCATGTGGCTGATGACCTATGTTGGTGCTGTTTTTAACGGAATCACCCTTCTAATTCT 680

Qy 3594 AGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTATGAACGGCATCAGGTGCAGATAGA 3653
         |||    |     | |||||| | ||  || | |||||   | |  ||   ||||| ||
Db  681 TGCTGAACTGCTCATTTTCAGTGTCCCGATTGTCTATGAGAAGTACAAGACCCAGATTGA 740

Qy 3654 TCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGATGCCATGGCCAAAATCCAAGCAAA 3713
        ||| ||| | ||  | ||     |       |||       | |  || |||||||||||
Db  741 TCACTATGTTGGCATCGCCCGAGATCAGACCAAGTCAATTGTTGAAAAGATCCAAGCAAA 800

Qy 3714 AATCCCTGGATT 3725
        | |||||||| |
Db  801 ACTCCCTGGAAT 812



RESULT 6 
US-08-700-607-4 

; Sequence 4, Application US/08700607 
; Patent No. 5858708

GENERAL INFORMATION: 

APPLICANT: Bandman, Olga 
APPLICANT: Au-Young, Janice 
APPLICANT: Goli, Surya K. 
APPLICANT: Hillman, Jennifer L. 
; TITLE OF INVENTION: TWO NOVEL HUMAN NSP-LIKE PROTEINS 

NUMBER OF SEQUENCES: 9 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Incyte Pharmaceuticals, Inc. 
STREET: 3174 Porter Drive 
; CITY: Palo Alto 

STATE: CA 
COUNTRY: U.S. 
ZIP: 94304 
COMPUTER READABLE FORM: 



MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
; OPERATING SYSTEM: DOS 

SOFTWARE: FastSEQ Version 1.5 
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/700,607 

FILING DATE: Filed Herewith 
; ATTORNEY/AGENT INFORMATION: 
NAME: Billings, Lucy J. 
REGISTRATION NUMBER: 36,749 
REFERENCE/DOCKET NUMBER: PF-0114 US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 415-855-0555 
; TELEFAX: 415-845-4166 

INFORMATION FOR SEQ ID NO: 4: 
SEQUENCE CHARACTERISTICS: 
; LENGTH: 1095 base pairs 

TYPE: nucleic acid 
STRANDEDNESS: single 
; TOPOLOGY: linear 

MOLECULE TYPE: cDNA 
IMMEDIATE SOURCE: 
; LIBRARY: THP1NOB01 

CLONE: 31870 
US-08-700-607-4 

Query Match 5.4%; Score 203.6; DB 2; Length 1095; 

Best Local Similarity 61.6%; Pred. No. 7.6e-38; 

Matches 337; Conservative 1; Mismatches 208; Indels 1; Gaps 1; 

Qy 3174 TTCAGTTGTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGC 3233
        | | ||    || ||  | | :||||||||  | |||||||||||  | || |||||  |
Db  328 TGCGGTGCACGATCTGATTTTMTGGAGAGATGTGAAGAAGACTGGGTTTGTCTTTGGCAC 387

Qy 3234 CAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACAT 3293
        ||   |  || ||||||| || ||| |||  |||||  |  |||||||     | ||| |
Db  388 CACGCTGATCATGCTGCTTTCCCTGGCAGCTTTCAGTGTCATCAGTGTGGTTTCTTACCT 447

Qy 3294 TGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGC 3353
           | |||| || ||||| || || |||||||| ||||| || |||  ||| ||||| ||
Db  448 CATCCTGGCTCTTCTCTCTGTCACCATCAGCTTCAGGATCTACAAGTCCGTCATCCAAGC 507

Qy 3354 TATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCATATTTAGAATCTGAAGTTGCTAT 3413
        | | ||||| ||||| |||||||| |||||||  || ||  | ||    ||  || || |
Db  508 TGTACAGAAGTCAGAAGAAGGCCATCCATTCAAAGCCTACCTGGACGTAGACATTACTCT 567

Qy 3414 ATCAGAGGAATTGGTTCAGAAATACAGTAATTCTGCTCTTGGTCATGTGAACAGCACAAT 3473
         ||    |||    | || || ||||  ||| ||||  | |  ||  | |||||  |  |
Db  568 GTCCTCAGAAGCTTTCCATAATTACATGAATGCTGCCATGGTGCACATCAACAGGGCCCT 627

Qy 3474 AAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGT 3533
         |||    | |  || || ||  | || || ||| | ||||| ||| ||||| | || ||
Db  628 GAAACTCATTATTCGTCTCTTTCTGGTAGAAGATCTGGTTGACTCCTTGAAGCTGGCTGT 687

Qy 3534 GTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTT 3593
         || |||||| || | || |||||||||||  | || || ||  | || || || ||| |
Db  688 CTTCATGTGGCTGATGACCTATGTTGGTGCTGTTTTTAACGGAATCACCCTTCTAATTCT 747

Qy 3594 AGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTATGAACGGCATCAGGTGCAGATAGA 3653
         |||    |     | || ||| | ||  || | ||| |   | |  |||| |  |   |
Db  748 TGCTGAACTGCTCATTTTNAGTGTCCCGATTGTNTATNAGAAGTACAAGGTTC-CAAGCA 806

Qy 3654 TCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGATGCCATGGCCAAAATCCAAGCAAA 3713
          | |  || | |    |||| || ||    ||  |   |||||  |  |   | ||||
Db  807 AAACTCCCTGGAATCGCCAAAAAAAAGGCAGAATAAGTACATGGAAACCAGAAATGCAAC 866

Qy 3714 AATCCCT 3720
        | |  ||
Db  867 AGTTACT 873



RESULT 7 

US-09-621-976-740 

; Sequence 740, Application US/09621976 
; Patent No. 6639063 
; GENERAL INFORMATION: 

; APPLICANT: Dumas Milne Edwards, J.B. 

; APPLICANT: Jobert, S. 

; APPLICANT: Giordano, J.Y. 

; TITLE OF INVENTION: ESTs and Encoded Human Proteins. 

; FILE REFERENCE: GENSET.054PR2
; CURRENT APPLICATION NUMBER: US/09/621,976
; CURRENT FILING DATE: 2000-07-21
; NUMBER OF SEQ ID NOS: 19335
; SOFTWARE: Patent.pm
; SEQ ID NO 740
; LENGTH: 454
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 229..453
US-09-621-976-740 

Query Match 5.3%; Score 198; DB 4; Length 454; 

Best Local Similarity 71.8%; Pred. No. 9.5e-37; 

Matches 359; Conservative 7; Mismatches 72; Indels 62; Gaps 6; 

Qy 19 GGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCGATCGCGAAGGCAGCAGAA 78

I I I I I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I 

Db 10 GGCGGCGGCGGCAAGTGGGGACAGGGCGGGTGGCGCATCACCGGCGCGSAGGCAGGAGGA 69 

Qy 79 GCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTTCGGCTCGGCTCGGCACGA 138 

I II I I I I I I I I I I I II I I I I I I I I I I II I I I I M I I I I I 
Db 70 GCAGTCTCATTGTTCCGGGAGCCGTCACCACAGTAGGTCCCTCGG 114

Qy 139 CTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACTCTGAG 198 

M I I I III I I I I I I I M I I I I I II I I I I I I I I I I I I I II II I I I I I I I I I 
Db 115 CTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCTCTGAG 174 

Qy 199 GAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCCATGGA 257 

I M I I I I I I I I I I I I I I I I | | | | | | | | | I I I I I I I I I I I II I I I 
Db 175 ACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCCATGGA 233 



Qy 258 AGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCTCCGCC 317
M I I I M I I I I I I I I I I I I M I I I I I I | | | | | | | | | | | | M | | : | | | |
Db 234 AGACCTGGACCAGTCTCCTCTGGT---CTCGTCCTCGGACAGCCCACCCCGGCMGCAGCC 290

Qy 318 CGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAGGACGAGGAGGAGGAGGA 377
Ml I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I II I I I I I
Db 291 CGCGTTCAAGTACCAGTTCRTGAGGGAGCCCGAGGACGAGGAG 333

Qy 378 GGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTGCTGGAGAGGAAGCCCGC 437
M I I I I I Mill I II I II II I M II I I M I I II M I M
Db 334 GAAGACCTGGAGGAGCTGGAGGTGCTGGAGAGGAAGCCCGC 374

Qy 438
I I M I I I I I I : I I I II I || | | I II I I : I | : | I I I I II III MM
Db 375

Qy 492
Mill II II I I II
Db 435



RESULT 8 

US-09-621-976-741 

; Sequence 741, Application US/09621976 
; Patent No. 6639063 
; GENERAL INFORMATION: 

; APPLICANT: Dumas Milne Edwards, J.B. 

; APPLICANT: Jobert, S. 

; APPLICANT: Giordano, J.Y. 

; TITLE OF INVENTION: ESTs and Encoded Human Proteins. 
; FILE REFERENCE: GENSET.054PR2
; CURRENT APPLICATION NUMBER: US/09/621,976
; CURRENT FILING DATE: 2000-07-21
; NUMBER OF SEQ ID NOS: 19335
; SOFTWARE: Patent.pm
; SEQ ID NO 741
; LENGTH: 463
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: 237..461
; NAME/KEY: misc_feature
; LOCATION: 20
; OTHER INFORMATION: n=a, g, c or t
US-09-621-976-741 

Query Match 5.3%; Score 196.6; DB 4; Length 463; 

Best Local Similarity 70.2%; Pred. No. 2e-36; 

Matches 358; Conservative 12; Mismatches 78; Indels 62; Gaps 6; 

Qy 10 CTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCGATCGCGAAG 69 

M I : :: M M II M I II II II I II I I II I I II I II I I MUM 
Db 9 CTCATCTRGBRNRGCGGCGGCAAGTGGGGACAGGGCGGGTGGCGCATCACCGGCGCGGAG 68 

Qy 70 GCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTTCGGCTCGGC 129 
I I I I I I I M I I I M I I II I I II II I I I I || I I I | | | | | | I I 



Db 69 GCAGGAGGAGCAGTCTCATTGTTCCGGGAGCCGTCACCACAGTAGGTCCCTCGG-- 122 

Qy 130 TCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGC 189

IN I I Ml I I II I I I I I I I I I I I I I I I II I | | M | M I II I I 

Db 123 CTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGC 173 

Qy 190 GACTCTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCC 248

I I I I I I I I I II I I I I I II I I I I I M I I I I I I I I I I I I I I I | I I 
Db 174 GGCTCTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCC 232

Qy 249 AGCCATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCG 308

I I I I I I I I I I I I I I I I I I I I I I I I I | | | I I I I I I I I I I I I | | | | | | | | | | 
Db 233 AGC CAT GGAAGAC CT GGACCAGT CT C CT CT GGT CTCGTCCTCGGACAGCCCACCCCG 289 

Qy 309 GCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAGGACGAGGA 368

N: I I I I I I I I I I I I I I I I I I I II I : I I I I I I I I I I I I I I I I M I I I I 
Db 290 GCMGCAGCCCGCGTTCAAGTACCAGTTCRTGAGGGAGCCCGAGGACGAGGAG 341

Qy 369 GGAGGAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTGCTGGAGAG 428

M I I I I I Mill I I I I I II I I I I I I I I I I 
Db 342 GAAGACCTGGAGGAGCTGGAGGTGCTGGAGAG 373

Qy 429 GAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCC 482

I M : I I : I I I I I I I : I I I | | I I I I I I I | [ | : || : | | | | | | | 

Db 374 GAAGCCCGCMGCMGGGCTGTMCGCGGCCCCAGTGCMCACMGCMCCTGCMGCMGGCGCGCC 433

Qy 483 GCTGCTGGACTTCAGCAGCGACTCGGTGCC 512 

I I I I I I I I I I I I I I I II MM: 
Db 434 CCTGATGGACTTCGGAAATGACTTCGTGCM 463



RESULT 9 

US-09-149-476-102 

; Sequence 102, Application US/09149476 

; Patent No. 6420526 

; GENERAL INFORMATION: 

; APPLICANT: Rosen et al.
; TITLE OF INVENTION: 186 Human Secreted proteins
; FILE REFERENCE: PZ002P1
; CURRENT APPLICATION NUMBER: US/09/149,476
; CURRENT FILING DATE: 1998-09-08
; EARLIER APPLICATION NUMBER: PCT/US98/04493

; EARLIER FILING DATE: 1998-03-06 

; EARLIER APPLICATION NUMBER: 60/040,162 

EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,333 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/038,621 

EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,626 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,334 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,336 
; EARLIER FILING DATE: 1997-03-07 
; EARLIER APPLICATION NUMBER: 60/040,163 
; EARLIER FILING DATE: 1997-03-07 



EARLIER APPLICATION NUMBER: 60/047,600 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,615 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,597 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,502 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,633 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,583 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,617 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,618 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,503 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,592 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,581 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,584 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,500 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,587 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,492 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,598 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,613 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,582 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,596 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,612 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,632 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,601 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,580 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,568 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,314 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,569 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,311 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,671 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,674 



EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,669 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,312 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,313 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,672 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,315 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/048,974 
EARLIER FILING DATE: 1997-06-06 
EARLIER APPLICATION NUMBER: 60/056,886 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,877 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,889 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,893 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,630 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,878 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,662 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,872 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,882 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,637 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,903 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,888 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,879 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,880 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,894 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,911 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,636 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,874 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,910 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,864 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,631 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,845 
EARLIER FILING DATE: 1997-08-22 



EARLIER APPLICATION NUMBER: 60/056,892 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/057,761 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/047,595 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,599 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,588 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,585 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,586 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,590 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,594 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,589 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,593 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/047,614 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,578 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/043,576 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/047,501 
EARLIER FILING DATE: 1997-05-23 
EARLIER APPLICATION NUMBER: 60/043,670 
EARLIER FILING DATE: 1997-04-11 
EARLIER APPLICATION NUMBER: 60/056,632 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,664 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,876 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,881 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,909 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,875 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,862 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,887 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/056,908 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/048,964 
EARLIER FILING DATE: 1997-06-06 
EARLIER APPLICATION NUMBER: 60/057,650 
EARLIER FILING DATE: 1997-09-05 
EARLIER APPLICATION NUMBER: 60/056,884 
EARLIER FILING DATE: 1997-08-22 
EARLIER APPLICATION NUMBER: 60/057,669 



EARLIER FILING DATE: 1997-09-05 

EARLIER APPLICATION NUMBER: 60/049,610 
; EARLIER FILING DATE: 1997-06-13 
; EARLIER APPLICATION NUMBER: 60/061,060 
; EARLIER FILING DATE: 1997-10-02 

Query Match 4.8%; Score 180.4; DB 4; Length 794; 

Best Local Similarity 61.0%; Pred. No. 1.7e-32; 

Matches 332; Conservative 6; Mismatches 202; Indels 4; Gaps 3; 

Qy 3174 TTCAGTTGTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGC 3233 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I 

Db 253 TGCGGTGCACGATCTGATTTTCTGGAGAGATGTGAAGAAGACTGGGTTTGTCTTTG--GA 310

Qy 3234 CAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCC-TACA 3292 

II I I I I I I I I I I I I I I I I I I I I I I I I 111:111 IN: 

Db 311 CACGCTGATCATGCTGCTTTCCCTGGCAGCTTTCAGTGTCATCARTGTGGGTTTCTTAMC 370

Qy 3293 TTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGG 3352 

I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I 
Db 371 TCATCCTGGCTCTTCTCTCTGTCACCATCARCTTCAGGATCTACAAGTCCGTCATCCAAG 430

Qy 3353 CTATCCAGAAATCAGATGAAGGCCACCCATT-CAGGGCATATTTAGAATCTGAAGTTGCT 3411

I I I I I I I I : I I I I I : I I I I I I I I I I : I I I I I I I III II I I II 
Db 431 CTGTWCAGAARTCAGAARAAGGCCATCCAWTCCAAAGCCTACCTGGACGTAGACATTACT 490

Qy 3412 ATATCAGAGGAATTGGTTCAGAAATACAGTAATTCTGCTCTTGGTCATGTGAACAGCACA 3471

III Mi I I I I I I I I I I I I I I II II II I I II I I I 

Db 491 CTGTCCTCAGAAGCTTTCCATAATTACATGAATGCTGCCATGGTGCACATCAACAGGGCC 550

Qy 3472 ATAAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCA 3531 

I I I I M I I I I I I I I I I I I I I I I I I II I I I I I II I I I I 

Db 551 CTGAAACTCATTATTCGTCTCTTTCTGGTAGAAGATCTGGTTGACTCCTTGAAGCTGGCT 610 

Qy 3532 GTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATT 3591 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II II II I I I I I I I I II 
Db 611 GTCTTCATGTGGCTGATGACCTATGTTGGTGCTGTTTTTAACGGAATCACCCTTCTAATT 670 

Qy 3592 TTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTATGAACGGCATCAGGTGCAGATA 3651 

1 I I I I I I I I I I I I I I I I I I I I I I II II I I I I I 

Db 671 CTTGCTGAACTGCTCATTTTCAGTGTCCCGATTGTCTATGAGAAGTACAAGACCCAGATT 730

Qy 3652 GATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGATGCCATGGCCAAAATCCAAGCA 3711

M I I I I I I I I I III I M I II I I I I I I I 

Db 731 GATCACTATGTTGGCATCGCCCGAGATCAGACCAAGTCAATTGTTGAAAAGATCCCAAGC 790

Qy 3712 AAAA 3715 

I I I I 

Db 791 AAAA 794 

RESULT 10 
US-08-700-607-9 

; Sequence 9, Application US/08700607 

; Patent No. 5858708 

; GENERAL INFORMATION: 

APPLICANT: Bandman, Olga 



APPLICANT: Au-Young, Janice 
APPLICANT: Goli, Surya K. 
APPLICANT: Hillman, Jennifer L. 

TITLE OF INVENTION: TWO NOVEL HUMAN NSP-LIKE PROTEINS 
NUMBER OF SEQUENCES: 9 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Incyte Pharmaceuticals, Inc. 
STREET: 3174 Porter Drive 
CITY: Palo Alto 
STATE: CA 
COUNTRY: U.S. 
ZIP: 94304
COMPUTER READABLE FORM: 
MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 1.5 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/700,607
FILING DATE: Filed Herewith 
ATTORNEY/AGENT INFORMATION: 
NAME: Billings, Lucy J. 
REGISTRATION NUMBER: 36,749 
REFERENCE/DOCKET NUMBER: PF-0114 US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 415-855-0555
TELEFAX: 415-845-4166
INFORMATION FOR SEQ ID NO: 9: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 261 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
IMMEDIATE SOURCE: 

LIBRARY: SPLNFET01 
CLONE: 28742 
US-08-700-607-9 

Query Match 4.4%; Score 164.6; DB 2; Length 261; 

Best Local Similarity 86.7%; Pred. No. 4.4e-29; 

Matches 176; Conservative 0; Mismatches 27; Indels 0; Gaps 0; 

Qy 3237 CTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGC 3296
        | ||| || |||||| ||  ||||||| ||||||||||| || ||||| |||||||||||
Db    1 CCTATNCCNGCTGCTTTCATTGACAGTATTCAGCATTGTGAGCGTAACAGCCTACATTGC 60

Qy 3297 CTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGCTAT 3356
        ||| ||||||| ||| ||||| ||||||| |||| |||| ||||| |||||||| |||||
Db   61 CTTNGCCCTGCNCTCTGTGACCATCAGCTNTAGGCTATACAAGGGTGTGATCCAAGCTAT 120

Qy 3357 CCAGAAATCAGATGAAGGCCACCCATTCAGGGCATATTTAGAATCTGAAGTTGCTATATC 3416
        |||||||||||||||||| |||||||||||||||||| | || |||||||||||||||||
Db  121 CCAGAAATCAGATGAAGGNCACCCATTCAGGGCATATCTGGANTCTGAAGTTGCTATATC 180

Qy 3417 AGAGGAATTGGTTCAGAAATACA 3439
         ||||| ||| ||||||| ||||
Db  181 TGAGGAGTTGNTTCAGAAGTACA 203



RESULT 11 

US-08-232-463-14/C 

; Sequence 14, Application US/08232463 
; Patent No. 5670367 

GENERAL INFORMATION: 
; APPLICANT: DORNER, F. 

APPLICANT: SCHEIFLINGER, F. 

APPLICANT: FALKNER, F. G. 

TITLE OF INVENTION: RECOMBINANT FOWLPOX VIRUS 
NUMBER OF SEQUENCES: 52 
; CORRESPONDENCE ADDRESS: 

ADDRESSEE: Foley & Lardner 

STREET: 1800 Diagonal Road, Suite 500 

CITY: Alexandria 

STATE: VA 

COUNTRY: USA 
; ZIP: 22313-0299

COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 
; APPLICATION NUMBER: US/08/232,463

FILING DATE: 
CLASSIFICATION: 435 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/07/935,313 
FILING DATE: 
; APPLICATION NUMBER: EP 91 114 300.6 

; FILING DATE: 26-AUG-1991 

ATTORNEY/AGENT INFORMATION: 
NAME: BENT, Stephen A. 
REGISTRATION NUMBER: 29,768 
REFERENCE/DOCKET NUMBER: 30472/114 IMMU 
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (703)836-9300 

; TELEFAX: (703)683-4109

TELEX: 899149 
INFORMATION FOR SEQ ID NO: 14: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 7218 base pairs 
TYPE: nucleic acid 
STRANDEDNESS: single 
; TOPOLOGY: linear 

IMMEDIATE SOURCE: 
CLONE: pTZgpt-Fls 
US-08-232-463-14 

Query Match 2.0%; Score 75.4; DB 1; Length 7218; 

Best Local Similarity 5.3%; Pred. No. 1.9e-07; 

Matches 22; Conservative 242; Mismatches 153; Indels 0; Gaps 0; 



Qy 1127 T T AAAGGCT C C C CAAAAGGAGAGT CAGC CAT AT TAGT AGAAAACAC T AAG GAAGAAGT AA 118 6 



Db 1459 TTAAAGAGATAGAAGAATTTGGTACRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRrrr 14 00 

Qy 118 7 T T GT GAGGAGT AAAGACAAAGAG GAT T T AGT T TGTAGT GCAGC C CT T C AC AGT C C AC AAG 1246 

:::::::::::::::::::: : : :::::: : : : : : : : 

Db 1399 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 134 0 

Qy 1247 AAT C AC C T GT GGGT AAAGAAGAC AGAGT TGTGTCTC C AGAAAAGACAAT GGAC AT T T T T A 1306 

: : : :::::::::::::::: : : :::::::::::::: : 

Db 133 9 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR 12 80 

Qy 1307 AT GAAAT GC AGAT GT C AGT AGT AGC AC CT GT GAGGGAAGAGTAT GC AGAC T T T AAGC CAT 1366 

Db 127 9 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRrrrrr 1220 

Qy 1367 T T GAACAAGCAT G G GAAGT GAAAGAT ACT T AT GAGGGAAGT AGGGAT GT G C T GG CT GCT A 1426 

: : : : : : : :::::: :::::: : : :::::::: : : : : : : : : : : : 
Db 1219 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRrrrrr 1 1 6 0 

Qy 142 7 GAGCTAAT GTGGAAAGTAAAGT GGAC AGAAAATGCTTGGAAGAT AGC CT GGAGCAAAAAA 14 8 6 

Db 1159 RRRRRRRRRRRRRRRRRRrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr 1100 

Qy 148 7 GT CT T GGGAAG GAT AGT GAAGGC AGAAAT GAG GAT GCTTCTTTC C CCAGT AC C C CAG 1543 

: :::::::::::::::::::: ::::| I | I | | | | | | | 

Db 1099 RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRATCGCAAGCTCCCTCGACCTGCAG 1043 



RESULT 12 
US-09-128-155-16 

; Sequence 16, Application US/09128155 
; Patent No. 6117654 
; GENERAL INFORMATION: 
; APPLICANT: Pan, Yang 

; TITLE OF INVENTION: NOVEL MOLECULES OF TANGO-77 RELATED PROTEIN FAMILY 

; TITLE OF INVENTION: AND USES THEREOF 

; FILE REFERENCE: 09404/052001 

; CURRENT APPLICATION NUMBER: US/09/128,155 

; CURRENT FILING DATE: 1998-08-03 

; EARLIER APPLICATION NUMBER: US 60/091,650 

; EARLIER FILING DATE: 1998-07-02 

EARLIER APPLICATION NUMBER: US 60/054,646 
; EARLIER FILING DATE: 1997-08-04 
; NUMBER OF SEQ ID NOS : 18 

SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 16 

LENGTH: 152331 
TYPE: DNA 

ORGANISM: Homo sapiens 
FEATURE : 

NAME/KEY: misc_feature
LOCATION: (1)...(152331)
OTHER INFORMATION: n = A,T,C or G 
US-09-128-155-16 



Query Match 2.0%; Score 75.2; DB 3; Length 152331; 

Best Local Similarity 53.8%; Pred. No. 1.2e-06; 



Matches 155; Conservative 0; Mismatches 133; Indels 0; Gaps 0;



Qy 463 CCGCCCGCCGCCGCCGCGCCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCC 522

M I Ml Ml I INI I III I I I II M I I I I I I I I I I

Db 21936 CCCCGCCCCCCCGGCCCGCCCCCCGCGGCCCCCCACCCCCCCCCCCCCCCCCCCGCGCCC 21995

Qy 523 CGCGGGCCGCTGCCGGCCGCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGC 582

Ml Ml I I I I I I I II I I I I I I I I || || |

Db 21996 CGCCCCCCCCCCCGCGCCCCCCACCCCCCCGCCCCCCCGCCCCCCCCCCCCCCCCCCACC 22055

Qy 583 AGCCCCGCGGCGCCCGCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTC 642

I I I I I I I I I I III I I I I I I I I I I I I | | | | | ||

Db 22056 CCCCACACCCGGCCCACACGCACCCCCCACCCCGACGCCCCCGCCCCCCCCCCCCCGCAG 22115

Qy 643 CCAGAGGACGACGAGCCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCC 702

II I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml

Db 22116 CCGACGCCCCCCCCCCGCCCGCCCCGCCCCGCACCCCCGACCCCCCCCGCCGCCCCGCCC 22175

Qy 703 CTGGCGGAGCCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750

Ml I I I I I I I I I II III I I II I I I I I I I

Db 22176 CCGCCCCCCCCCCCCGGCCCCCCCCCCGCCGGCGCGGCGCCCCACCCC 22223



RESULT 13 

US-09-894-998A-35/c 

; Sequence 35, Application US/09894998A 

; Patent No. 6537555 

; GENERAL INFORMATION: 

; APPLICANT: Hosken, Nancy Ann 

; APPLICANT: Craig H. Day 

; APPLICANT: Davin C. Dillon 

; APPLICANT: McGowan, Patrick 

; APPLICANT: Sleath, Paul R. 

; TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE DIAGNOSIS AND 
; TITLE OF INVENTION: TREATMENT OF HERPES SIMPLEX VIRUS INFECTION 
; FILE REFERENCE: 210121.538 

; CURRENT APPLICATION NUMBER: US/09/894,998A
; CURRENT FILING DATE: 2001-06-28 
; NUMBER OF SEQ ID NOS : 64 

SOFTWARE: FastSEQ for Windows Version 4.0 
; SEQ ID NO 35 

LENGTH: 2481

TYPE: DNA 

ORGANISM: HSV-2 
US-09-894-998A-35 

Query Match 2.0%; Score 74.6; DB 4; Length 2481; 

Best Local Similarity 48.7%; Pred. No. 1.7e-07; 

Matches 203; Conservative 0; Mismatches 214; Indels 0; Gaps 0; 

Qy 327 GT AC C AGT T C GT GAC GGAG C C C GAGGAC GAG GAGGAC GAGGAGGAGGAG GAGGAC GAGGA 38 6 

IN I I I M I I II || I | I I I I I I I I I I I I I I I I I I I I 

Db 1920 GGAC GC G GAC G C GAC G C T C C CAC CAGC C C C G C C C GC AGAG GAAGAG G C GGAG GAG GAG GA 1861 

Qy 3 87 GGAGGAC GAC GAG GAC CT AGAG GAACT G GAGGT G CT G GAG AGGAAGC CC GC AG C C G G G CT 44 6 

I I I I I I I I I I I I I I I I I Mill I I II I I I I I I | II 

Db 18 60 G GC G GAG GAG GAGGAGGC GGAGGAGGAGGAG G C GGAGGAG GAG GAGGCGGAG GAG GAG GA 18 01 



Qy 447 GTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGCCGCTGCTGGACTTCAGCAGCGACTC 506 

I M I I II I N I I I I I I I I I I || | i | | 

Db 1800 GGCGGAGGAGGAGGAGGCGGAGGAGGAGGAGGCGGCGGCGACCGCGGCCTGGGACGACGG 1741 

Qy 507 GGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGCCGCTCCTGAGAGGCA 566

I M I I I I I I | | | I I I I | | | | | Ml 

Db 1740 AGACGCCGACGGGGGCGCGGCGCCCGCGGACGCCGGGGCGAGCGGCCCGTGGCCGCGGTC 1681 

Qy 567 GCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGCCATCCCTGCCGCCCGCTGCCGCAGT 62 6 

IN I I I I I I I I I I I I I II I IN | M I 

Db 168 0 GCCCGAGTCCGAGTCCGGGGCCCGGCGCGGCGCCGCCCTCTTGGCCCCCACCCCCTGGGG 1621 

Qy 627 CCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCCGGCGAGGCCCCCGCCTCCGCCGCC 68 6 

I I I II I I I I I I I I I II I || | M 

Db 162 0 GGCGAGGGGCGAGCGCGGGGCGGCGGAGGAAGAGGCGGAGGACGAGGCCGCGGGGCCCGA 1561 

Qy 687 AGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGCCCCCTTCCACGCCGGCCGCGCC 743 

Ml I II III I I I I I I I I I I I I I I I I I I II I I III 

Db 1560 GTCCGACCCGCGCCTCTTCCGGGGGCGGGCCGCCGCCCCCTCCGCGGCGTGGGGGGC 1504 
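
The "/c" suffix on this result identifier and the descending Db coordinates above suggest that the hit was reported against the complementary strand of the database entry. That reading is an inference from the listing; the report does not state it. Under that assumption, a Db segment can be compared with the stored patent sequence by taking its reverse complement. A minimal Python sketch (the name `reverse_complement` is illustrative and not part of GenCore):

```python
# Translation table for the four bases plus N, upper and lower case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide string."""
    # Complement each base, then reverse the order of the bases.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Last ten bases of the final Db line of this result.
    print(reverse_complement("GCGTGGGGGGC"))
```

Other IUPAC ambiguity codes (such as R) are left unchanged by this sketch and would need their own table entries.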



RESULT 14 

US-09-103-840A-2/c 

; Sequence 2, Application US/09103840A 

; Patent No. 6294328 

; GENERAL INFORMATION: 

; APPLICANT: FLEISCHMAN, Robert D. 

; APPLICANT: WHITE, Owen R. 

; APPLICANT: FRASER, Claire M. 

; APPLICANT: VENTER, John C. 

; TITLE OF INVENTION: DNA SEQUENCES FOR STRAIN ANALYSIS IN MYCOBACTERIUM 

; TITLE OF INVENTION: TUBERCULOSIS 

; FILE REFERENCE: 24366-20007.00 

; CURRENT APPLICATION NUMBER: US/09/103,840A

; CURRENT FILING DATE: 1998-06-24 

; NUMBER OF SEQ ID NOS : 2 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 2 

LENGTH: 4403765 

TYPE: DNA 

; ORGANISM: Mycobacterium tuberculosis 
FEATURE : 

OTHER INFORMATION: CDC 1551 

OTHER INFORMATION: "n" bases at various positions throughout the sequence 
OTHER INFORMATION: represent a, t, c or g 
US-09-103-840A-2 



Query Match 2.0%; Score 73.2; DB 3; Length 4403765; 

Best Local Similarity 52.3%; Pred. No. 2.2e-05; 

Matches 162; Conservative 0; Mismatches 148; Indels 0; Gaps 0; 

Qy 434 CCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGCCGCTGCTGGACT 493

I I I I I I I I I I II I I I II I I I || | | | | I I M | I I 

Db 3926346 CCGTGCCGGCGCTGCCCGCGCCGCCGGCGCCGCCTTGGCCGCCGGTGCCGCCGATACCGG 3926287



Qy 494 TCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGCCG 553

> II M M I I I I I I I I I I I I I I I I I I I I I I | I I 

Db 3926286 CCTTGCCCGCGGCGCCGACAACCCCGCCGGTTCCTCCGGTGCCGGCGGCCCCGCCGGCCC 3926227



Qy 554 CTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGCCATCCCTGCCGC 613

I I I I I I I I I II II II II I MINI II l| I 

Db 3926226 CGCCGGCGCCGGCGTTACCGCCAGTCCCACCCGCGCCGCCGTCGGCGCCAATCCCGCTGG 3926167



Qy 614 CCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCCGGCGAGGCCCC 673 

I I IN Ml I I I I I I I I I I I I I I I I I III 

Db 3926166 CATTATCAGCACCGGAGCCACCCATGCCGCCGGCGCCGCCTTGGCCGCCGGTGCCGCCGG 3926107



Qy 674 CGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGCCCCCTTCCACGC 733 

I II Ml Ml M I I I I I I III II Mill I III 

Db 3926106 CACCACCGGAGCCGTTGATGCCGCCGGCAATGGCGTTGCCGCCCTGGCCGCCGACGCCGC 3926047



Qy 734 CGGCCGCGCC 743 

II I I I II I I 
Db 3926046 CGGCCCCGCC 3926037 



RESULT 15 

US-09-103-840A-1/c

; Sequence 1, Application US/09103840A 

; Patent No. 6294328 

; GENERAL INFORMATION: 

; APPLICANT: FLEISCHMAN, Robert D. 

; APPLICANT: WHITE, Owen R. 

; APPLICANT: FRASER, Claire M. 

; APPLICANT: VENTER, John C. 

; TITLE OF INVENTION: DNA SEQUENCES FOR STRAIN ANALYSIS IN MYCOBACTERIUM 

; TITLE OF INVENTION: TUBERCULOSIS 

; FILE REFERENCE: 24366-20007.00 

; CURRENT APPLICATION NUMBER: US/09/103,840A

; CURRENT FILING DATE: 1998-06-24 

; NUMBER OF SEQ ID NOS : 2 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 1 

LENGTH: 4411529 
; TYPE: DNA

; ORGANISM: Mycobacterium tuberculosis 

OTHER INFORMATION: H37Rv 
US-09-103-840A-1 



Query Match 2.0%; Score 73.2; DB 3; Length 4411529; 

Best Local Similarity 52.3%; Pred. No. 2.2e-05; 

Matches 162; Conservative 0; Mismatches 148; Indels 0; Gaps 0;

Qy 434 CCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGCCGCTGCTGGACT 493

I M I I || II II I I I I I I II I M I I I I I II M I I 

Db 3932558 CCGTGCCGGCGCTGCCCGCGCCGCCGGCGCCGCCTTGGCCGCCGGTGCCGCCGATACCGG 3932499



Qy 494 TCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGCCG 553

I II M M I I 1 II I I I I I I I I I I I I I II I I I II 

Db 3932498 CCTTGCCCGCGGCGCCGACAACCCCGCCGGTTCCTCCGGTGCCGGCGGCCCCGCCGGCCC 3932439

Qy 554 CTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGCCATCCCTGCCGC 613 

I I I I I I I II I I || I I 1 I I I I I I | | M | I I 

Db 3932438 CGCCGGCGCCGGCGTTACCGCCAGTCCCACCCGCGCCGCCGTCGGCGCCAATCCCGCTGG 3932379

Qy 614 CCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCCGGCGAGGCCCC 673 

I I I I I IN Ml II I I I I I I I I II I I III 

Db 3932378 CATTATCAGCACCGGAGCCACCCATGCCGCCGGCGCCGCCTTGGCCGCCGGTGCCGCCGG 3932319

Qy 674 CGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGCCCCCTTCCACGC 733 

I I I I I I Ml M I I II I I III II I I I I I | Ml 

Db 3932318 CACCACCGGAGCCGTTGATGCCGCCGGCAATGGCGTTGCCGCCCTGGCCGCCGACGCCGC 3932259

Qy 734 CGGCCGCGCC 743 

I I I I I MM 
Db 3932258 CGGCCCCGCC 3932249 
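
The Score values in this report appear to follow directly from the per-alignment counts. With a match worth 1, a mismatch costing 0.6, and each gap costing Gapop 10.0 plus Gapext 1.0 per indel position, the formula reproduces the scores printed for RESULT 12 to RESULT 15 here and for RESULT 2 of the published-applications search. The mismatch value of -0.6 is not stated anywhere in the report; it is inferred from those figures. The function and constant names below are illustrative. A minimal sketch under those assumptions:

```python
MATCH = 1.0        # score per identical base (assumed from IDENTITY_NUC)
MISMATCH = -0.6    # inferred from the reported scores, not stated in the report
GAP_OPEN = 10.0    # "Gapop 10.0" from the search header
GAP_EXTEND = 1.0   # "Gapext 1.0" from the search header


def alignment_score(matches: int, mismatches: int,
                    indels: int = 0, gaps: int = 0) -> float:
    """Recompute a local-alignment score from the summary counts."""
    # Each gap pays the opening cost once; every indel position pays the
    # extension cost (this reading fits 7 gaps / 42 indels in RESULT 2).
    gap_cost = GAP_OPEN * gaps + GAP_EXTEND * indels
    return MATCH * matches + MISMATCH * mismatches - gap_cost


if __name__ == "__main__":
    # RESULT 15: Matches 162, Mismatches 148, reported Score 73.2
    print(round(alignment_score(162, 148), 1))
```

Alignments that contain ambiguity codes (for example the N bases in the SEQ ID NO: 9 hit, or the R run in RESULT 11) do not fit this simple formula, presumably because such positions are scored differently.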



Search completed: September 11, 2004, 16:11:41 
Job time : 242.145 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM nucleic - nucleic search, using sw model 



Run on:         September 11, 2004, 01:22:50 ; Search time 1590.94 Seconds
                                               (without alignments)
                                               11831.342 Million cell updates/sec

Title:          US-09-830-972-1
Perfect score:  3741
Sequence:       1 attgctcgtctgggcggcgg gattgaagcgcaaagcagat 3741

Scoring table:  IDENTITY_NUC
                Gapop 10.0 , Gapext 1.0

Searched:       3304383 seqs, 2515761380 residues

Total number of hits satisfying chosen parameters: 6608766

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :       Published Applications NA:*
                 1:  /cgn2_6/ptodata/1/pubpna/US07_PUBCOMB.seq:*
                 2:  /cgn2_6/ptodata/1/pubpna/PCT_NEW_PUB.seq:*
                 3:  /cgn2_6/ptodata/1/pubpna/US06_NEW_PUB.seq:*
                 4:  /cgn2_6/ptodata/1/pubpna/US06_PUBCOMB.seq:*
                 5:  /cgn2_6/ptodata/1/pubpna/US07_NEW_PUB.seq:*
                 6:  /cgn2_6/ptodata/1/pubpna/PCTUS_PUBCOMB.seq:*
                 7:  /cgn2_6/ptodata/1/pubpna/US08_NEW_PUB.seq:*
                 8:  /cgn2_6/ptodata/1/pubpna/US08_PUBCOMB.seq:*
                 9:  /cgn2_6/ptodata/1/pubpna/US09A_PUBCOMB.seq:*
                 10: /cgn2_6/ptodata/1/pubpna/US09B_PUBCOMB.seq:*
                 11: /cgn2_6/ptodata/1/pubpna/US09C_PUBCOMB.seq:*
                 12: /cgn2_6/ptodata/1/pubpna/US09_NEW_PUB.seq:*
                 13: /cgn2_6/ptodata/1/pubpna/US09_NEW_PUB.seq2:*
                 14: /cgn2_6/ptodata/1/pubpna/US10A_PUBCOMB.seq:*
                 15: /cgn2_6/ptodata/1/pubpna/US10B_PUBCOMB.seq:*
                 16: /cgn2_6/ptodata/1/pubpna/US10C_PUBCOMB.seq:*
                 17: /cgn2_6/ptodata/1/pubpna/US10_NEW_PUB.seq:*
                 18: /cgn2_6/ptodata/1/pubpna/US60_NEW_PUB.seq:*
                 19: /cgn2_6/ptodata/1/pubpna/US60_PUBCOMB.seq:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
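
The throughput figure in the header of this search can be cross-checked from the other header values. The sketch below assumes that a "cell update" is one query-residue by database-residue comparison and that both strands were searched. The second assumption is inferred from the hit count, 6608766, being exactly twice the 3304383 sequences searched; the report does not state it. The function name is illustrative.

```python
def cell_updates_per_sec(query_len: int, db_residues: int,
                         seconds: float, strands: int = 2) -> float:
    """Dynamic-programming cells filled per second of search time."""
    # One cell per (query base, database base) pair, per strand searched.
    return strands * query_len * db_residues / seconds


# Header values for the published-applications search.
mcups = cell_updates_per_sec(3741, 2_515_761_380, 1590.94) / 1e6

if __name__ == "__main__":
    print(f"{mcups:.1f} Million cell updates/sec")
```

The result agrees with the reported 11831.342 Million cell updates/sec to within the rounding of the printed search time.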
ALIGNMENTS



RESULT 1 

US-09-893-348-17 

; Sequence 17, Application US/09893348 
; Patent No. US20020072493A1 
; GENERAL INFORMATION: 



; APPLICANT: EISENBACH-SCHWARTZ, Michal
; APPLICANT: COHEN, Irun R. 
; APPLICANT: BESERMAN, Pierre 
; APPLICANT: MOSONEGO, Alon 
; APPLICANT: MOALEM, Gila 

; TITLE OF INVENTION: ACTIVATED T-CELLS, NERVOUS SYSTEM-SPECIFIC ANTIGENS AND THEIR USES

; FILE REFERENCE: EIS-SCHWARTZ=2A 

; CURRENT APPLICATION NUMBER: US/09/893,348

; CURRENT FILING DATE: 2001-06-28 

; PRIOR APPLICATION NUMBER: US 09/314,161 

; PRIOR FILING DATE: 1999-05-19

PRIOR APPLICATION NUMBER: US 09/218,277 

PRIOR FILING DATE: 1998-12-22 
; PRIOR APPLICATION NUMBER: PCT/US98/14715 

PRIOR FILING DATE: 1998-07-21 
; PRIOR APPLICATION NUMBER: IL 124500 
; PRIOR FILING DATE: 1998-05-19 
; NUMBER OF SEQ ID NOS : 29 
; SOFTWARE: PatentIn version 3.1
; SEQ ID NO 17 

LENGTH: 4684 
TYPE: DNA 

ORGANISM: Rattus norvegicus 

FEATURE: 

NAME/KEY: CDS

LOCATION: (253)..(3744)
; OTHER INFORMATION: 
US-09-893-348-17 

Query Match 100.0%; Score 3739.4; DB 9; Length 4684; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 3740; Conservative 0; Mismatches 1; Indels 0; Gaps 0; 

Qy 1 ATTGCTCGTCTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCG 60

I I I M I I I I II I I I I II 1 I I I I I I I I | | | | | | M | M | | | | | | | | | | | | | | | | | | 

Db 1 ATTGCTCGTCTGGGCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCG 60 

Qy 61 ATCGCGAAGGCAGCAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT 120

I M I I I I I I I I I I I I I I II || I I I | | | | | | | | | | | | | | || | | | | | | | | | | | | | | | | | | | 
Db 61 ATCGCGAAGGCAGGAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTT 120

Qy 121 CGGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACA 180 

I I I I I I I I I I I I I I M M I I I I I I I I I I I I I I I I I I I I | | | M | | | | | | | | | | | | | | | | | 
Db 121 CGGCTCGGCTCGGCACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACA 180 

Qy 181 ACCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGC 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 ACCGCCCGCGACTCTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGC 240

Qy 241 GACCCGCCAGCCATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGC 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 GACCCGCCAGCCATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGC 300

Qy 301 CCGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 CCGCCCCGGCCTCCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAG 360



Qy 3 61 GAC GAGGAGGAGGAGGAGGACGAGGAGGAGGACGACGAGGAC CTAGAGGAACT GGAGGT G 420 

M I I I I I I I I II I I I I II I I I I M I I I I i I I I I I I I I I I I I I I I I | | | I I I I M I | | | | | 
Db 3 61 GAC GAGGAGGAGGAGGAGGAC GAGGAGGAGGACGAC GAGGACCTAGAGGAACT GGAGGT G 42 0 

Qy 421 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCG 48 0 

I M I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | 
Db 421 CTGGAGAGGAAGCCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCG 480 

Qy 4 81 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 54 0 

M I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | || | | | | | | | | | | M 
Db 4 81 CCGCTGCTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCC 54 0 

Qy 541 GCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCG 600 

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | M I I I I I I I I I I I I I I I I I I I I I | I I I 
Db 541 GCGCCCCCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCG 600 

Qy 601 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCT 660 

I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I 
Db 601 CCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCT 660 

Qy 661 CCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCG 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I 
Db 661 CCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCG 72 0 

Qy 721 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTT 7 80 

M I I I I I I I I I I I M I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 721 CCCCCTTCCACGCCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTT 7 80 

Qy 781 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 84 0 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 7 81 TTTGCTCTTCCTGCTGCATCTGAGCCTGTGATACCCTCCTCTGCAGAAAAAATTATGGAT 84 0 

Qy 841 T T GAT GGAG C AGC CAG GTAAC ACT GTTTCGTCTGGT CAAGAGGAT T T C C CAT CT GT C C T G 9 00 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I II I I I I I I 
Db 841 T T GAT G GAGC AGC CAGGT AACAC TGTTTCGTCT GGT CAAGAGGAT T T C C CAT C T GT C C T G 900 

Qy 901 CTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 960 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I I II I I I I I I I 
Db 901 CTTGAAACTGCTGCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAA 960 

Qy 961 CAT G GAT AC CT T GGT AACT T AT CAG C AGT GT CAT C CT C AGAAG GAACAAT T GAAGAAAC T 1020 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II II I I I I I I I I || 
Db 961 CAT GGAT AC CT T G GTAAC T TAT C AGC AGT GT CAT C C T C AGAAGGAACAAT T GAAGAAAC T 1020 

Qy 1021 TTAAATGAAGCTTCTAAAGAGTTGCCAGAGAGGGCAACAAATCCATTTGTAAATAGAGAT 108 0 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1021 TTAAATGAAGCTTCTAAAGAGTTGCCAGAGAGGGCAACAAATCCATTTGTAAATAGAGAT 1080 

Qy 1081 T TAG CAGAAT T T T CAGAAT T AGAAT ATT CAGAAAT GGGAT CAT CT T T TAAAGGC T C CC C A 114 0 

I I I I I I I I I I I M M I I I I I I I I I I II I I I I I II I II I I I I I II I I I I I I I I || I I I I I I 
Db 1081 T TAG CAGAAT T T T CAGAAT T AGAAT ATT CAGAAAT G GGAT CAT CT T T T AAAGGCT C C C C A 1140 

Qy 1141 AAAGGAGAGT CAG C CAT AT T AGT AGAAAACAC T AAG GAAGAAGT AAT T GT GAGGAGT AAA 12 00 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I I II I II I I I I I I I I I I I I I 
Db 1141 AAAGGAGAGT CAGCCAT AT TAGTAGAAAACACTAAGGAAGAAGTAATT GT GAGGAGT AAA 1200 



Qy 1201 GACAAAGAG GAT T T AGT T T GT AGT GC AG C C CT T C ACAGT C C ACAAGAAT C AC C T GT G GGT 1260 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I M I I I I I I II I I I I I I I I I I I I | | | | 
Db 1201 GACAAAGAGGAT T T AGT T T GT AGT GCAGCCCTTC AC AGT CC ACAAGAAT C AC CTGTGGGT 1260 

Qy 1261 AAAGAAGACAGAGT T GT GT CT C C AGAAAAGAC AAT GGAC AT T T T T AAT GAAAT GCAGAT G 132 0 

I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I | I I I 
Db 1261 AAAGAAGACAGAGTTGTGTCTCCAGAAAAGACAATGGACATTTTTAATGAAATGCAGATG 1320

Qy 1321 T CAGT AGT AGCAC C T GT GAGGGAAGAGT AT GCAGACT TT AAGC CAT T T GAACAAG CAT GG 1380 

I I I I II I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 
Db 1321 T CAGT AGT AG CAC CT GT GAGG GAAGAGT AT GCAGACT TTAAG C CAT T T GAACAAGCAT G G 138 0 

Qy 1381 GAAGT GAAAGAT ACT T AT GAGG GAAGT AG G GAT GT GCT G G CT G C T AG AGCT AAT GT GGAA 1440 

I I I I I I I I M I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1381 GAAGT GAAAGAT ACT TAT GAGG GAAGT AGG GAT GT GCTGGCT GCT AGAGCTAAT GT GGAA 144 0 

Qy 1441 AGT AAAGT GGAC AGAAAAT GCT T G GAAGAT AGC C T GGAG CAAAAAAGT C T T G GGAAGGAT 1500 

I I I I M I I I I I II I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 14 41 AGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAAGTCTTGGGAAGGAT 1500 

Qy 1501 AGT GAAG GC AGAAAT GAG GAT GC T TCTTTCCC CAGT AC C C CAGAAC CT GT GAAGGAC AGC 1560 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1501 AGT GAAGGC AGAAAT GAGGAT GCTTCTTTCCC CAGT AC C CC AGAAC C T GT GAAGGACAGC 1560 

Qy 15 61 T C C AGAG CAT AT AT T AC CTGTGCTTCCTT T AC CT C AGCAAC CGAAAGC AC C ACAGCAAAC 162 0 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I | | | | 
Db 1561 T C CAGAGC AT AT AT T AC CT GT GCT T C CT T T AC CT C AGCAAC C GAAAG CAC CAC AG CAAAC 1620 

Qy 1621 ACTTT CC CT TT GTT AGAAGAT CAT ACTT CAGAAAATAAAACAGAT GAAAAAAAAATAGAA 168 0 

I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 
Db 1621 AC TTTCCCTTTGT T AGAAGAT CAT ACT T CAGAAAATAAAACAGAT GAAAAAAAAATAGAA 1680 

Qy 1681 GAAAGGAAG GC C CAAATT AT AAC AGAGAAGACT AGC C C CAAAACGT CAAAT CCTTTCCTT 174 0 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 1681 GAAAG GAAG GC C CAAATT AT AAC AGAGAAGACT AGC C C CAAAAC GT CAAAT CCTTTCCTT 1740 

Qy 1741 GT AGC AGT ACAGGAT T CT GAG G C AGAT TAT GT T ACAAC AGAT AC C TT AT CAAAGGT GACT 18 00 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i 

Db 1741 GTAGCAGTACAGGATT CTGAGGCAGATTATGTT ACAACAGATACCTTAT CAAAGGT GACT 18 00 

Qy 18 01 GAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATTTAGTTCAGGAAGCATGT 18 60 

II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 

Db 1801 GAGGCAGCAGT GT CAAACAT GC CT GAAGGT CT GAC GC CAGATT TAGTT CAGGAAGCAT GT 18 60 

Qy 18 61 GAAAGT GAAC T GAAT GAAG C C AC AGGT ACAAAGAT T GC T TAT GAAACAAAAGT GGACT T G 192 0 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I II 
Db 1861 GAAAGT GAAC T GAAT GAAGC CAC AGGT ACAAAGAT T GCT TAT GAAACAAAAGT GGACT T G 1920 

Qy 1921 GT C CAAACAT CAGAAGCT AT ACAAGAAT C ACT T T AC C C CAC AG CAC AG CT TT GC C CAT C A 1980 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 1921 GT C CAAACAT CAGAAGCT AT ACAAGAAT C ACT T T AC C C C ACAG CACAG C T T T GC C CAT C A 1980 

Qy 1981 T T T GAGGAAG CT GAAG CAAC T C C GT CAC CAGT T T T G C C T GAT AT T GT TAT GGAAG CAC C A 204 0 

I I I I I I I I I I I I I I II I I I I I I I I I II I I I I M I I II I I I II I II I I I I I I I I I I I I I I I 
Db 1981 T T T GAGGAAGCT GAAG CAAC T C C GT CAC CAGT T T T G C CT GAT AT T GT TAT G GAAG CAC C A 2040 

Qy 2041 TTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTG 2100 



Db 2041 TTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCACTG 2100 

Qy 2101 GAAGC ACC T C CT C C AGT T AGT TAT GAC AGT AT AAAG CT T GAG C C T GAAAAC C C C C CAC C A 2160 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
Db 2101 GAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCTGAAAACCCCCCACCA 2160 

Qy 2161 TAT GAAGAAG C CAT GAAT GT AG C ACT AAAAG C T TT G GGAACAAAGGAAG GAAT AAAAGAG 2220 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2161 TAT GAAGAAGCCAT GAAT GTAGCACTAAAAGCTTT GGGAACAAAGGAAGGAAT AAAAGAG 2220 

Qy 2221 C CT GAAAGTT TTAAT GC AG CT GT T C AGGAAAC AGAAG C T C CT TAT AT AT C CAT T GC GT GT 22 80 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I t I I I I I I I I I I I I I I II I II I I I I I I I I I I I 
Db 2221 C C T GAAAGT T T T AAT G CAGCT GT T C AGGAAAC AGAAGC T C CT T ATAT AT C CAT T G C GT GT 22 80 

Qy 2281 GATT T AAT T AAAGAAACAAAGCT CT C CAC T GAG C CAAGT C C AGAT T T CT C T AAT T ATT C A 2340 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 2281 GATTTAATTAAAGAAACAAAGCTCTCCACTGAGCCAAGTCCAGATTTCTCTAATTATTCA 2340

Qy 2341 GAAAT AGCAAAATT C GAGAAGT CGGTGCCC GAAC ACG CT GAG CT AGT GGAGGATT C CT CA 24 00 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I I I M I I I I I II I II 
Db 2341 GAAAT AGCAAAATT C GAGAAGT C GGT GC C C GAAC AC G CT GAGCT AGT GGAGGAT T C CT CA 2400 

Qy 2 4 01 C C T GAAT C T GAAC C AGT T GACT T AT T T AGT GAT GAT T CGAT T C C T GAAGT C C CACAAACA 24 60 

I I I I I I I I I I I I I I I I I I I I I I I II I i I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II 
Db 2401 C CT GAAT CT GAACC AGT T GACT TAT T T AGT GAT GAT T C GAT T C CT GAAGT C C CACAAACA 2460 

Qy 2461 CAAGAGGAGGCT GT GAT GCTCAT GAAGGAGAGT CT CACT GAAGT GT CT GAGACAGTAGCC 2520 

I II I I II I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I II M I I I I I I 
Db 2 4 61 C AAGAGGAG G CT GT GAT G C T CAT GAAGGAGAGT CT CACT GAAGT GT CT GAGAC AGT AGC C 2520 

Qy 2 521 CAGCACAAAGAGGAGAGACTTAGTGCCTCACCTCAGGAGCTAGGAAAGCCATATTTAGAG 2 58 0 

I I I I I I I I I II II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I 
Db 2521 CAGCACAAAGAGGAGAGACTTAGTGCCTCACCTCAGGAGCTAGGAAAGCCATATTTAGAG 2580 

Qy 2581 T CT TT T C AG C C C AATT T AC AT AGT ACAAAAGAT GC T G C AT CTAAT GAC AT T C CAACAT T G 2640 

I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2581 T C TT T T C AG C C CAAT T T AC AT AGT AC AAAAGAT GCT GC AT CTAAT GAC AT T C CAACAT T G 2 64 0 

Qy 2 641 AC CAAAAAG GAGAAAAT T T CT T T G C AAAT GGAAGAGT TTAAT ACT G CAAT T TAT T CAAAT 2700 

II I I I I I M I I I I I I I I I I I I I I I I 1 I I I I I I II II I I I I I I I I II I M I I I I I I I I I I I 

Db 2 641 AC CAAAAAG GAGAAAAT T T CT T T GC AAAT GGAAGAGT T T AAT ACT G CAAT T TAT T CAAAT 27 00 

Qy 2701 GAT GACT TACT T T CTT CT AAGGAAGACAAAAT AAAAGAAAGT GAAAC AT T T T C AGAT T CA 27 60 

I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I 
Db 2 7 01 GAT GACT TACT T T CTT CT AAGGAAGACAAAAT AAAAGAAAGT GAAAC AT T T T C AGATT CA 2760 

Qy 27 61 T CT C C GAT T GAGATAAT AG AT GAAT T T C C CAC GT T T GT C AGT G C T AAAGAT GATT C T C CT 2820 

I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I II I I I I I II I I I I II I I I I I I I I I I 
Db 27 61 T CT C C GAT T GAGATAAT AGAT GAAT T T C C CAC GT T T GT C AGT G CT AAAGAT GAT T CT C CT 2 82 0 

Qy 2 821 AAAT T AGC CAAG GAGT AC ACT GAT CT AGAAGT AT C C GACAAAAGT GAAAT T G CTAAT AT C 28 8 0 

I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I II 
Db 2821 AAAT T AGC CAAG GAGT AC ACT GAT CT AGAAGT AT C C GACAAAAGT GAAAT T G CTAAT AT C 28 80 

Qy 2881 CAAAGC GG G G C AG ATT CAT T GC CT T G CT T AGAAT T GC C CT GT GAC CTTTCTTT CAAGAAT 294 0 

I II I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 2881 CAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTCAAGAAT 2940



Qy 2941 ATATATC CTAAAGAT GAAGTACAT GTTT CAGAT GAATT CT CC GAAAATAGGT C CAGT GT A 3000 

I I I I I I I I I I I I 1 II I I I I I I I I I II I I II I M M I I I I I I I I II I I I I I I I I I II I I I I 
Db 2 941 AT AT AT C CTAAAGAT GAAGTACAT GTTT CAGAT GAAT T CT C C GAAAAT AG GT C CAGT GT A 3 000 

Qy 3001 T C TAAG G CAT C CATAT C GC CTT CAAAT GT CT CT G CTTTG GAAC CT CAGAC AGAAAT GG GC 3 060 

I I I I I I I I I I I I I I I I I I I I I I I I II II I i II I I I I I I I I I II I I I II I I I I I I I I I I I I 
Db 3001 T C T AAGG C AT C CATAT C GC CT T CAAAT GT CT CT G CTT T GGAAC CT CAGAC AGAAAT GG GC 3060 

Qy 3061 AG C AT AGT T AAAT C CAAAT C ACT T AC GAAAGAAGCAGAGAAAAAACT T C CT T CT GACAC A 3120 

I I I I M I I I II I I I I ! I I II I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I I I I M 
Db 3061 AG C ATAGTTAAAT C CAAAT CACTT ACGAAAGAAGCAGAGAAAAAACTT C CTT CT GACACA 3120 

Qy 3121 GAGAAAGAGGACAGAT C C C T GT CAGCT GT AT T GT CAGCAGAGCT GAGTAAAACT T CAGT T 318 0 

I I M I I I M M 1 I I I I I M i M I M I II I I I I I II t I M M I I I t I M I I M I M II I M 

Db 3121 GAGAAAGAG GACAGAT C C CT GT CAGCT GT AT T GT CAGCAGAGCT GAGTAAAACT T CAGT T 318 0 

Qy 3181 GTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTA 324 0 

I II I I I I i I I I I II I I I II I II I I I I I I II I I I I M I I I I I I I I I I I I I I I I II II I I I I 

Db 3181 GTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGCTTA 324 0 

Qy 3241 TTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGT CAGT GTAACGGCCTACATTGC CTT G 330 0 

II I I I I I I I I I I I I I I I I I I I II I I I I II I M I II I I II I M I I M I I I I I I II I M I II 

Db 3241 TTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCCTTG 3300 

Qy 3301 GCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGCTATCCAG 3360 

I I I II I I I II II II II I I I I II I M I I I I I I II I II II II I II I M II I II I II II II I I 
Db 3301 GCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGCTATCCAG 3360 

Qy 3361 AAAT CAGAT GAAG G C CAC C CAT T CAGGGC AT AT TTAGAAT CT GAAGT T G CT AT AT CAG AG 3420 

I I I I I I II I I I M I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3361 AAAT CAGAT GAAG G C CAC C C ATT CAGGGCAT ATTTAGAATCTGAAGTT G CTATAT CAGAG 3420 

Qy 3421 GAAT T GGT T CAGAAAT AC AGT AAT T CT GCT CTT GGT CAT GT GAAC AG CACAAT AAAAGAA 3480 

I II M I i I I I I I I I I II I I I M I I I II I I I t I I I I I I I II I M I I M M I I I 1 I I i I I (I 

Db 3421 GAATTGGTTCAGAAATACAGTAATT CT GCTCTT GGT CAT GT GAAC AG CAC AAT AAAAGAA 3480 

Qy 3481 CTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTGATG 3540 

I I I I I I II I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I 
Db 3481 CTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTGATG 3540 

Qy 3541 TGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTG 3600 

I I I II II I I I I I I II I I I II I I I I I I I M I II I I II I I I M I I II I M I I I I II II I I II 
Db 3541 TGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCTCTG 3600 

Qy 3601 AT CT C ACT CTT CAGT AT T C C T GT T AT T TAT GAACGG C AT CAGGT GC AGAT AGAT CAT TAT 3660 

I I I I I I I II I I I I I I I I I I I II I I M I II I I II II II I I I I I II I I M I M I I I I I I II I 
Db 3601 AT C T C ACT CTT CAGT AT T C CT GT T AT T TAT GAAC GGC AT CAGGT G CAGAT AGAT CAT TAT 3660 

Qy 3661 C T AGGACT T G CAAACAAGAGT GT TAAG GAT G C CAT GG CCAAAAT C CAAGCAAAAAT C C C T 372 0 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I II II I I I I M I I I I I I I I I I I I I I I 
Db 3 661 CT AG GACT T GCAAACAAGAGT GT T AAG GAT G C CAT G GC CAAAAT C CAAG C AAAAAT C C C T 3720 

Qy 3721 G GAT T GAAG C G C AAAG CAGAT 37 41 

I I I I I I I I I I I I I I I I I I I I I 
Db 3721 G GAT T GAAG C G CAAAG CAGAT 3741 
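
The two percentages on each result's summary lines are consistent with simple ratios. Query Match matches Score divided by the Perfect score (3741 for this query), and Best Local Similarity matches Matches divided by the number of aligned columns (matches, conservative, mismatches, and indels). These definitions are inferred from the figures in this report rather than taken from GenCore documentation, and the function names are illustrative. A minimal sketch:

```python
def query_match(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect score, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned columns."""
    aligned = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned, 1)


if __name__ == "__main__":
    # RESULT 1 above: Score 3739.4, Matches 3740, Mismatches 1.
    print(query_match(3739.4, 3741))
    print(best_local_similarity(3740, 0, 1, 0))
```

Both values round to the 100.0% shown for RESULT 1 even though the alignment contains one mismatch.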



RESULT 2 

US-10-267-502-214 

Sequence 214, Application US/10267502 
Publication No. US20040071700A1
GENERAL INFORMATION: 
APPLICANT: Kim, Jaeseob 
APPLICANT: Galant, Ron 

TITLE OF INVENTION: Obesity Linked Genes 
FILE REFERENCE: LSD-07416 

CURRENT APPLICATION NUMBER: US/10/267,502
CURRENT FILING DATE: 2003-01-27 
NUMBER OF SEQ ID NOS: 439
SOFTWARE: PatentIn version 3.2
SEQ ID NO 214 
LENGTH: 3492
TYPE: DNA 

ORGANISM: Mus musculus 
US-10-267-502-214 

Query Match 81.9%; Score 3065.4; DB 12; Length 3492; 

Best Local Similarity 93.6%; Pred. No. 0; 

Matches 3286; Conservative 0; Mismatches 181; Indels 42; Gaps 7
Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I MM M M M M M M I I 

Db 1 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCGCGGATAGCCCGCCCCGGCCC 60 

Qy 313 CCGCCCGCCTT CAAGT AC C AGT T C GT GAC GGAGC CCGAGGAC GAG GAGG AC GAGGAGGAG 372 

M I M M I I M M M M M M M M M M M M M M M M M M M M M M M M 

Db 61 CCGCCCGCTTT CAAGT AC CAGT T C GT GAC GGAGC C CGAGGAC GAG GAGG AC GAGGAAGAC 120 

Qy 373 GAGGAGG AC GAGGAG GAGGAC GAC GAG GAC CT AGAGGAACT GGAGGT GCT GGAGAG GAAG 432 

M M M I M M M M M M M M M M M M M M M M M M M I M M M M I 

Db 121 GAGGAG GAGGAGGAGGAC GAC GAG GAC CTGGAGGAATT GGAGGT GCT GGAGAGGAAG 177 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGCCGCTGCTGGAC 492 

M M M M M M M M M M I I MM MM M M M M M M M M M M M 

Db 178 CCCGCAGCCGGGCTGTCCGCGGTTCCGGT GCCCCCCGCCGCCGCACCGCTGCTGGAC 234 

Qy 493 TTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGCC 552

M M M M M M M M M M M M M M M M M M M M M M M M M M I M I I I 

Db 235 TTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCCACC 294 

Qy 553 GCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGCCATCCCTGCCG 612 

M M M M M M M M M M M M M M M I M M M M I I M M M M M M M M 

Db 295 GCCCCTGAGAGGCAGCCGTCCTGGGAACGCAGCCCCGCGGCGTCCGCGCCATCCCTGCCG 354 

Qy 613 CCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCCGGCGAGGCCC 672 

M M M M M M M M I M M M M M M M I M M M M M M M I M Ml 

Db 355 CCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCGGAGGACGACGAGCCTCCAGCG 408

Qy 673 CCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGCCCCCTTCCACG 732

I I II I M I M I I II II II M I M II I M I I M M M I I I II I I I II M I I M M M I 

Db 409 CGGCCTCCGGCGCCAGCCGGCGCGAGCCCCCTAGCGGAGCCCGCCGCGCCCCCTTCCACG 468



Qy 733 CCGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCT 792 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I II I I I I I I I I M 
Db 469 CCGGCCGCGCCCAAGCGCAGGGGCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCT 528

Qy 793 G CT GC AT CT GAGC C T GT GAT AC CCTCCTCT GCAGAAAAAAT TAT GGAT T T GAT GGAGC AG 852 

I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I 
Db 529 G C T GC AT CT GAGC CT GT GAT AC CCTCCTCTG C AGAAAAAAT TAT G GATT T GAAG GAGC AG 58 8 

Qy 853 CCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCT 912 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 589 CCAGGTAACACTGTTTCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGTTTGAAACTGCT 648

Qy 913 GCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTT 972 

I I I I I I I I I I M I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 649 GCCTCTCTTCCTTCTCTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACACGGATACCTT 708

Qy 973 G GT AAC T TAT CAGCAGT GT CAT C C T C AGAAGGAACAAT T GAAGAAACTT TAAAT GAAGC T 1032 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I 
Db 709 GGTAACT T AT CAGCAGT G G CAT C C AC AGAAGGAACT AT T GAAGAAAC T T TAAAT GAAGC T 768 

Qy 1033 T CTAAAGAGT T G CCAGAGAG GGCAACAAAT C CAT T T GTAAAT AGAGAT T T AGC AGAAT T T 1092 

I I I I II! I I I I I I I I I I I I II I I I I I II I I II I I I I I II I II II I I I I I I I I I II 
Db 7 69 T CTAGAGAATT GCCAGAGAGGGCAACAAAT CCATTT GTAAAT AGAGAGT CAGCAGAGTT T 828 

Qy 1093 T C AGAAT T AGAAT AT T C AGAAAT GGGAT C AT CT T T TAAAGGCT CC C CAAAAGGAGAGT C A 1152 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I II I I I I I I I II I I I I II I I I I I I 
Db 829 T C AGT AT T AGAAT ACT C AGAAAT GGGAT CAT C T T T CAAT G G CT C C C CAAAAGGAGAGT C A 888 

Qy 1153 GC C AT AT T AGT AGAAAACAC T AAGGAAGAAGT AAT T GT GAGGAGT AAAGACAAAGAGGAT 1212 

I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I 
Db 889 GC CAT GT T AGT AGAAAACAC TAAGGAAGAAGT AAT T GT GAGGAGT AAAGACAAAGAGGAT 948 

Qy 1213 T T AGT T T GT AGT GC AG C C CT T C AC AGT C C AC AAGAAT C AC CT GTG 1257 

I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 
Db 949 TTAGTTTGTAGTGCAGCCCTTCATAATCCACAAGAGTCACCTGCGACCCTTACTAAAGTG 1008

Qy 1258 GGTAAAGAAGACAGAGTTGTGT CTCCAGAAAAGACAATGGACATTTTTAATGAAAT GCAG 1317 

I MINIMI I I I I I I I I I I I I I I I I I I I I || I I || I I I I I I I I I I I I I I I I I I 

Db 1009 GTTAAAGAAGAC GGAGTTAT GT CT CCAGAAAAGACAATGGACATTT TTAAT GAAAT GAAA 1068 

Qy 1318 AT GT CAGT AGT AGCAC CT GT GAGGGAAGAGT AT GCAGACT T T AAG C CAT T T GAACAAGC A 1377 

II I I I I I I I I M I I I I I I I I I II I I I I I I I I I I I I I I M I I I II I I I I I M I I I I II I 

Db 1069 ATGTCAGTGGTAGCACCTGTGAGGGAAGAGTATGCAGATTTTAAGCCATTTGAACAAGCA 112 8 

Qy 137 8 T GGGAAGT GAAAGATACTT AT GAGGGAAGTAGGGAT GT GCT GGCT GCTAGAGCTAATGT G 1437 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II 
Db 112 9 T GGGAAGT GAAAGATACTT AT GAGGGAAGTAGGGAT GT GCT GGCT GCTAGAGCTAAT AT G 1188 

Qy 1438 GAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAAGTCTTGGGAAG 1497

I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I M 
Db 1189 GAAAGTAAAGTGGACAAAAAATGCTTTGAAGATAGCCTGGAGCAAAAAAGTCATGGGAAG 1248

Qy 1498 GAT AGT GAAGGCAGAAAT G AGGAT GCTTCTTTCCC CAGT AC C C C AGAAC CT GT GAAGGAC 1557 

I I I I M I II I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I M I I 
Db 1249 GAT AGT GAAAGCAGAAATGAGAAT GCTT CT TT C C CCAGTAC C CCAGAACTT GT GAAGGAC 1308 

Qy 1558 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTACCTCAGCAACCGAAAGCACCACAGCA 1617 



Db 1309 GG CT C C AGAGC GT AC AT CAC C T GT GAT T C CT T T AC C T C AGCAACC GAGAGT ACT GCAGC A 1368 

Qy 1618 AACACTTT C CCTTTGTT AGAAGAT CAT ACT T CAGAAAATAAAACAGAT GAAAAAAAAAT A 1677 

I I I I I I I I I II II I I I I I I I I II I I I I II I M I I I I I I I I I I II I I I I I I I I II I I 
Db 1369 AAC AT TTTCCCTGT GC T AGAAGAT CAC AC T T CAGAAAATAAAACAGAT GAAAAAAAAAT A 142 8 

Qy 1678 GAAGAAAG G AAG G C C C AAAT T AT AAC AGAGAAGAC T AGC C C C AAAAC GT C AAAT C CT T T C 1737 

I II I I I i I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I M I I 
Db 1429 GAAGAAAGGAAGGCCCAAATTATAACAGAGAAGACTAGC C C CAAAACGT CAAAT CCTTT C 14 8 8 

Qy 1738 CT T GT AG C AGT AC AG GAT T CT GAGG C AGAT TAT GT T ACAACAGAT AC C T TAT CAAAGGT G 17 97 

I I I I I I I I I MM I I I II II I I M M I II I I I II I II I I I I I I I I I I I I I I I I I 
Db 14 89 C T T GT AGCAAT AC AT GAT T C C GAG GCAGAT TAT GT C ACAACAGAT AAT T TAT CAAAGGT G 154 8 

Qy 1798 ACT GAGGC AGC AGT GT CAAACAT G C CT GAAG GT C T G AC G C CAGAT T T AGT T C AG GAAGC A 1857 

I I I I II I I I I MM III I M I I I I II M I I I I II I II II I II I I II I II II I I I II 
Db 154 9 AC T GAG GC AGT AGT G GCAAC CAT G C C T GAAGGT CT AAC GC CAGAT T TAGT T CAGGAAGC A 1608 

Qy 1858 T GTGAAAGT GAACTGAAT GAAGCC ACAGGT ACAAAGAT T GCTTATGAAACAAAAGT GGAC 1917 

I M I I M I M M I II II I II II I I I I M II I I II I I I II I I II I I I I II I I I II II II I 
Db 1609 T GT GAAAGT GAACT GAAC GAAG C C AC AGGT AC AAAGAT T GCT T AT GAAACAAAAGT GGAC 1668 

Qy 1918 T T GGT C CAAACAT C AGAAG CT AT ACAAGAAT CAC T T T AC C C C ACAG CAC AGCT T T GC C C A 1977 

I I I II I II I M II I II I I II I I II I I li III I I II II I I M I I I I I I I I I I I M I II 
Db 1669 T T GGT C CAG AC AT CAGAAGCT AT ACAAGAGT CAAT T T AC C C CACAGCAC AGCT T T GC C C A 1728 

Qy 1978 T C ATTT GAG GAAG C T GAAGCAACT C C GT CAC CAGT T T T G C CT GAT AT T GTT AT G GAAGC A 2037 

I I I I I I I II I I II I I I I I I I I I I I I I I I I II I I I I I II II I I M II I I II I II I I II II 
Db 1729 TCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATATTGTTATGGAAGCT 1788 

Qy 2038 CCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTATCCCCA 2097

I I I I I I I I II I I I I I I I II I I I I I I M I I I I II I I II I I I II II I I II I I I II II I I 

Db 1789 CCATTAAATTCTCTCCTTCCAAGCACTGGTGCTTCTGTAGCGCAGCCCAGTGCATCCCCA 1848

Qy 2098 CT GGAAG CAC CT C CT C CAGT TAGT TAT GAC AGT ATAAAGCT T GAG C CT GAAAAC C C CC C A 2157 

II MM III II II II I I I I I II I I I I II I I M M I I II I I II II I I II I I I II I 

Db 184 9 C T AGAAGT AC C GT CT C CAGT TAGT TAT GACG GT ATAAAGCT T GAGC C TGAAAAT C C CC C A 1908 

Qy 2158 C CAT AT GAAGAAGC C AT GAAT GT AGC ACTAAAAG CT T T GG GAACAAAGGAAGGAAT AAAA 2217 

I I II I I I I II II II I I I II I M II I II II I I I I I II I II I I I I I I II I III 
Db 1909 C CAT AT GAAGAAGC CAT GAGT GT AGC ACT AAAAAC AT C GGAC G C AAAGGAAGAAAT TAAA 1968 

Qy 2218 GAGC CT GAAAGT T TT AAT GC AGC T GTT C AGGAAACAGAAGC T C C T TAT AT AT C CAT T G C G 2277 

I I I I II I I I I II I I I I II I I I I I II II II I I I I I I I II I I I I M II II I I I II II I I 
Db 1969 GAGC CT GAAAGT T TT AAT GCAGCT GCT C AGGAAG CAGAAG CT C CT T AT AT AT C CAT TGC A 2028 

Qy 2278 T GT GAT T T AAT T AAAGAAAC AAAG C T C T C CAC T GAG C C AAGT C CAGAT T T C T C T AAT TAT 2337 

I I II I M II I I I II II I I I I I II II I I I II I M I II II I I I I II II II I II II II I I I 
Db 2029 T GT GAT T T AAT T AAAGAAACAAAGC T CT C CAC T GAGC CAAGT C CAG G GT T CT CT AAT TAT 2 088 

Qy 2338 T C AGAAAT AGCAAAAT T C GAGAAGT CGGTGCCC GAAC AC GC T GAGCT AGT G GAGG AT T C C 2397 

I I I I I II I I I II I I I I I I II II I II II I II II III I I I I I I I II II I I I I I I 

Db 2 08 9 T CAGAAAT AGCAAAAT T T GAGAAGT C G GT AC CT GAT C ACT GT GAG CT C GT G GAT GAT T C C 214 8 

Qy 2398 T C AC CT GAAT C T GAAC C AGT T GAC T TAT T TAGT GAT GAT T C GAT T C C T GAAGT C C C AC AA 2457 

II I I I I I I I II I I I II I I II II I II I I II I II I M I I I II II I I I I I I M I I I I I I I I I 



Db 2149 T C AC C C GAAT CT GAAC C AGT T GACT TAT T T AGT GAT GAT T C GAT T C C T GAAGT C C CACAA 2208 

Qy 2458 ACAC AAGAG GAG GCT GT GAT GCT CAT GAAGGAGAGT CT C AC T GAAGT GT CT GAGACAGT A 2517 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I II I I I I I I | | | | | | I I I I | | | | I 
Db 2209 AC ACAAGAGGAGGC T GT GAT GCT AAT GAAGGAGAGT C T CAC T GAAGT GT CT GAGACAGT A 2268 

Qy 2518 GCCCAGCACAAA GAGGAGAGAC T T AGT G C C T CAC C T CAGGAG C TAG GAAAGCCAT AT 2574 

I M I I I I I I I II II I I I I I I I I I II I I II II I I I I I I I I I I I I I II I I I I I 

Db 2269 ACAC AAC AC AAAC AT AAG GAGAGAC T T AGT GC T T C AC CT CAGGAGGTAGGAAAG CCAT AT 2328 

Qy 2575 T T AGAGT CT T T T C AGC C C AATT T AC AT AGT ACAAAAGAT GCT GC AT CT AAT GAC AT T C C A 2634 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 2329 TTAGAGT CTTT T CAGCC CAATT TACAT ATTACAAAAGAT GCT GCAT CTAAT GAAATT CCA 2388 

Qy 2635 AC AT T GACCAAAAAG GAGAAAAT T T C T T T GCAAAT G GAAGAGT TTAAT ACT GC AAT T TAT 2694 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I 

Db 2389 ACAT T GAC C AAAAAG GAGACAAT T T C T T T G CAAAT GGAAGAGT T T AAT ACT G CAATTT AT 2448 

Qy 2 695 T CAAAT GAT GACTT ACTT T CTT CTAAGGAAGACAAAATAAAAGAAAGT GAAACATTTT CA 27 54 

II M I I I I I I II I I I I I I I I II I I II II I I I I I I I I I II I I I I I I I I I I I I I I I I I I 

Db 24 4 9 T C CAAT GAT GACTT ACT T T CT T C TAAGGAAGACAAAAT GAAAGAAAGT GAAAC AT TT T C C 2 508 

Qy 2755 GATT CATCT CC GATT GAGATAATAGAT GAATTT CCCAC GTTT GT CAGT GCTAAAGATGAT 2814 

I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I | I M 
Db 2509 GATT CAT CT C CCAT T GAGATAATAGAT GAGTTT CCCAC ATT T GT CAGT GCTAAAGAT GAT 25 68 

Qy 2 815 T C T C CTAAAT T AGC CAAGGAGT AC ACT GAT CT AGAAGT AT C C GACAAAAGT GAAAT T GC T 2874 

I M M I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I II I II I I I I I I 

Db 2569 TCTCCT AAG GAGT AC ACT GAC CT AGAAGT AT C CAACAAAAGT GAAAT T GCT 2619 

Qy 2875 AATATCCAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTGCCCTGTGACCTTTCTTTC 2934

Ml I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I II I I I 
Db 2620 AATGTCCAGAGCGGGGCCAATTCGTTGCCTTGCTCAGAATTGCCCTGTGACCTTTCTTTC 2679

Qy 2 935 AAGAAT AT AT AT C C T AAAGAT GAAGT ACAT GTTT C AGAT GAAT T CT C C GAAAAT AGGT C C 2 994 

I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I || I IN I I I I I I I 
Db 2680 AAGAAT AC AT AT CCT AAAGAT GAAG CAC AT GT CT C AGAT GAAT T CT C CAAAAGT AGGT C C 2739 

Qy 2 995 AGT GT AT C TAAGGC AT C CAT AT C GC CT T CAAAT GTCTCTGCTTTG GAAC C T C AGACAGAA 3054 

I I I I I I I I I I I I I M III II I I II I I I I I II I I I I I I I I I I I I I I I I I I 
Db 274 0 AGT GTATCTAAGGTGCCCTTATTGCTTC CAAAT GTTT CTGCTTTGGAATCT CAAAT AGAA 27 99 

Qy 3055 AT G GGC AG CAT AGT T AAAT C CAAAT CAC T T AC GAAAGAAG C AGAGAAAAAACT T C CTT CT 3114 

I I II I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2 800 AT GGGCAACAT AGT TAAACC CAAAGTACTTACGAAAGAAGCAGAGGAAAAACTT C CTT CT 2859 

Qy 3115 GACACAGAGAAAGAGGACAGAT CC CT GT CAGCT GT ATT GT CAGCAGAGCT GAGTAAAACT 3174 

II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I 

Db 2860 GAT ACAGAGAAAGAGGACAGAT CC CT GACAGCT GTATT GT CAGCAGAGCT GAATAAAACT 2919 

Qy 317 5 T CAGTT GT T GAC CT CCT CTACT GGAGAGACATTAAGAAGACT GGAGT GGT G TTTGGT 3231 

I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I I II 
Db 2920 T C AGTT GT T GAC C T C C T GT AC T GGAGAGAC AT T AAGAAGAC T G GAGT GGT GT AT T T T GGT 297 9 

Qy 3232 GCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTAC 3291

I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 2980 GCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTAC 3039



Qy 3292 ATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAG 3351 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I I II I 
Db 3040 ATTGCCTTGGCCCTGCTCTCTGTGACTATCAGCTTTAGGATATATAAGGGTGTGATCCAA 3099 

Qy 3352 G C TAT C C AGAAAT C AGAT GAAGGC C AC C CAT T C AGGGC AT AT T TAGAAT CT GAAGT T G CT 3411 

I I I I M I I I I M I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I II I I I 
Db 3100 GCT AT C CAGAAAT CAGAT GAAG GC CAC C CAT T C AG G GCAT AT T T G GAAT CT GAAGT T G C C 3159 

Qy 3412 AT AT CAGAGGAATT GGTT CAGAAAT AC AGTAATT CT GCT CT T GGT CAT GT GAACAGCACA 3471 

I I I I I I I I I M I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I 
Db 3160 AT AT C AGAG GAAT T G GT T CAGAAAT AT AG TAAT TCTGCTCTTG GT CAT GT GAAC AG CAC A 3219 

Qy 3472 AT AAAAGAACT GAGGC GG C T T T T C T T AGT T GAT GAT T T AGT T GAT T C C CT GAAGT T T GC A 3531 

I I I M I I I I I II I I I I II I I I I I I I I I I I I I I II I 1 I I I I I I I I I I I M I I I | | | | | 
Db 3220 ATAAAAGAAT T GAG GCGTCTCTTCT T AGTT GAT GATT T AGT T GAT T C C C T GAAGT T T GC A 3279 

Qy 3532 GT GT T GAT GT G G GT GT T TACT TAT GTT GGT GCCTTGTTCAAT GGT CTGACACTACT GATT 3591 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I II 1 I I I I I I I I I I I 
Db 3280 GT GT T GAT GT GG GTAT T TACT TAC GT T G GT G C CT T GTTCAAT GGT TTGAC AC TACT GATT 3339 

Qy 3592 T T AGCT CT GAT CT CACT C TT C AGT AT T C CT GT TAT T TAT GAAC GGC AT C AGGT GC AGAT A 3651 

I M I I M I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3340 TT AG CT C T GAT C T CAC T CTT C AGT AT T C CT GT TAT AT AT GAAC GGCAT C AGG C GCAGAT A 3399 

Qy 3652 GAT CAT TAT CT AGGACT T GCAAACAAGAGT GT TAAGGAT GC CAT GGC CAAAAT C CAAG C A 3711 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3400 GAT CAT TAT CTAGGACTT GCAAACAAGAGC GTTAAGGAT GC CAT GGCCAAAAT CCAAGCA 3459 

Qy 3712 AAAATCCCTGGATTGAAGCGCAAAGCAGA 3740
        |||||||||||||||||||||||||||||
Db 3460 AAAATCCCTGGATTGAAGCGCAAAGCAGA 3488
Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193

I M I 1 I I I III I II I I I I I I II I I I I I I I I I I I I I II I I I I I I I I I II 
Db 16 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 75 

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I M 
Db 76 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 134 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I I I I I I 
Db 135 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 191 

Qy 313 CCGCCCGCCTT CAAGT AC C AGTT C GT GAC G GAGC C C GAGGAC GAGGAGGAC GAGGAGGAG 372 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II II I I I I I I I I II III 
Db 192 C AG C C C GC GT T CAAGT AC CAGTT C GT GAG GGAG C C C GAGGAC GAGGAG GAAGAAGAG 248 

Qy 373 GAGGAGGAC GAG GAGGAGGAC GAC GAG GAC CT AGAG GAACT G GAG GT GCT GGAGAGGAAG 432 

I I I I I I I I I I I I I I I I II I I I I I I I I I I Mill I I I I I II II I I I I I I I I I I I I 
Db 24 9 GAGGAGGAAGAGGAGGAC GAGGAC GAAGACCT GGAGGAGCT GGAGGT GCT GGAGAGGAAG 308 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 

Db 309 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 368 

Qy 4 87 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546 

I I I I I I I I I I MM Mill M I I II I I II II II II II I II I II I Ml 

Db 369 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 428 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

II I I I I II I II I II I II I II I II I I I I II II I I I I I I I 

Db 42 9 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 4 88 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I I I I I I I I II I I I II I II I I II I I II I I II II II M I II I M I I II I I I II 
Db 489 GC GCCAT CCC C GCT GTCT GCT GCCGCAGTCTCGCCCTCCAAGCTCCCT GAGGAC GAC GAG 54 8 

Qy 65 8 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I I I I II I I II I II I I II I II II III Ml M II II II III II I 
Db 54 9 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 608 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

II I II II I II I I II Mill II I II II II I I I II I I I I 
Db 609 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 668 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 8 07 



Db 669 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 72 8 

Qy 808 GT GAT AC CCTCCTCTG CAGAAAAAAT TAT G GAT T T GAT GGAG CAGCC AGGTAAC ACT GT T 867 

I I I I I I I I I I I I I I I M II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | II 
Db 729 GTGATACGCTCCTCTGCAGAAAA TAT GGACTTGAAGGAG CAGCC AGGTAAC ACT ATT 785 

Qy 8 68 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 786 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 845 

Qy 928 CT AT CTCCTCTCT CAACT GT T T CT T T T AAAGAACAT GGAT AC CT T GGT AACT TAT CAGC A 987 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | || | | | || 

Db 846 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 905 

Qy 988 GT GT CAT C CT CAGAAGGAACAAT T GAAGAAAC T T T AAAT GAAG C T T C TAAAGAGT T GC C A 1047 

M I I II I I I I I I I II I II MINI I I I I I I I I I I I I I I I I I I I I II 
Db 9 06 GTATTACCCACTGAAGGAACACTTCAAGAAAATGTCAGTGAAGCTTCTAAAGAGGTCTCA 965 

Qy 104 8 GAG AG GG C AAC AAAT C CAT T T GT AAAT AGAGAT T TAG C AGAAT T T T C AGAAT T AGAAT AT 1107 

I I I I I I I I I I I II I I II I I I I I I I I I I I M I I I I I I II I I I I I II I I M 
Db 966 GAGAAGGCAAAAACT CTACT CAT AGAT AGAGATTTAACAGAGTTTT CAGAATTAGAATAC 1025 

Qy 1108 T CAGAAAT G G GAT CAT CT T T TAAAGGCT C C CCAAAAGGAGAGT CAGC CAT AT T AGT AGAA 1167 

I M I I I I I M I I I I I I I II I I IN I I I II I I III II Ml II I I I I I I I 

Db 102 6 T CAGAAAT GGGAT CAT C GT T CAGT GT CT CT C CAAAAG C AGAAT C T G C C GT AAT AGTAGC A 108 5 

Qy 1168 AAC AC T AAG G AAGAAGT AAT T GT GAGGAGT AAA GACAAAGAG GAT T T AGT TT GT AGT 1224 

II I II I I I I I I I I I I I I I I I I I I I I II I I II I I I || I I I II I I 

Db 1086 AAT C CTAGGGAAGAAATAAT CGT GAAAAATAAAGAT GAAGAAGAGAAGTT AGTTAGTAAT 1145 

Qy 1225 GC AG C C C T T C AC AGT C CACAAGAAT C AC CT GT GG GT AAAGAAGAC 1269 

I I I I I I I I I I I I I I I I I I II I I I I || I || I I 

Db 114 6 AACAT CCTT CATAAT CAACAAGAGTTAC CTACAGCT CTTACTAAATT GGT TAAAGAGGAT 1205 

Qy 12 70 AG AGT T GT GT CT C CAGAAAAGACAAT G GAC AT T T T TAAT GAAAT GC AGAT GT CAGT AGT A 132 9 

I I I M I I I I I I I I I I I I III I I M I I I I I I I I I I I I | I I I I I | 

Db 12 06 GAAGT T GT GT CT T CAGAAAAAGCAAAAGAC AGT T T TAAT GAAAAGAGAGT T G CAGT GGAA 1265 

Qy 133 0 G C AC C T GT GAG G GAAGAGT AT G C AGAC T T T AAG C CAT T T GAACAAGC AT GG GAAGT GAAA 138 9 

M I I I I I I I I I I II I I I I I I I I I II II I I I I I I I I I II I I I I I I I I I I I I I 
Db 12 66 GCT C CTAT GAGGGAGGAATAT GCAGACTT CAAAC CATTT GAGCGAGT AT GGGAAGT GAAA 132 5 

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I 

Db 1326 GATA GTAAGGAAGAT AGT GAT AT GTTGGCTGC T GGAG GT AAAAT C GAGAGCAACT T G 1382 

Qy 1438 GAAAGTAAAGT GGAC AGAAAAT G CT T GGAAGAT AGC C T GGAGCAAAAAAGT CT T GG GAAG 1497 

I II II I II I I I I I I I I I II I I I I I I I I I I I I II I I I I I II III I II 
Db 1383 GAAAGTAAAGT GGATAAAAAAT GT TTT GCAGATAGC CT T GAGCAAACTAAT CACGAAAAA 1442 

Qy 1498 GAT AGT GAAG GC AGAAAT GAG GAT GCTTCTTTCCC CAGT AC C C C AGAAC CT GT GAAGGAC 1557 

I I I I I I I I I II I I I I I III I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 1443 GAT AGT GAGAGT AGTAAT GAT GAT ACT T C T T T CC C CAGT AC GC C AGAAG GT AT AAAGGAT 1502 

Qy 1558 AG C T C C AGAGC AT AT AT T AC CT GT GCTTCCTT T A C C T C AG C AAC C GAAAGC AC C AC A 1614 

I M I M I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 



Db 1503 C GT T CAG GAGCAT AT AT CAC AT GTGCTCCCTT T AAC C C AG C AGCAACT GAGAGC AT T G C A 1562 

Qy 1615 G C AAAC AC TTTCCCTTT GT T AGAAGAT CAT ACT T CAGAAAATAAAAC AGAT GAAAAAAAA 1674 

I I I I II III I I I I I I II I I I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I 
Db 1563 AC AAAC AT TTTTCCTTT GT TAG GAGAT C C T AC T T C AGAAAAT AAGAC C GAT GAAAAAAAA 1622 

Qy 1675 ATAGAAGAAAGGAAGG C C CAAAT T AT AAC AGAGAAG AC TAG C C C C AAAAC GT C AAAT 1731 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I I I II I I I I I 
Db 162 3 ATAGAAGAAAAGAAGGCCCAAATAGTAACAGAGAAGAATACTAGCAC CAAAACAT CAAAC 1682 

Qy 1732 CCTTTCCTT GT AG C AG T AC AG GAT T CT GAGGC AGAT TAT GT T ACAAC AGAT AC CT T AT C A 17 91 

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I III II 
Db 168 3 CCT TTT CTT GTAGCAGCACAGGATT CT GAGACAGATTAT GT CACAACAGATAATTTAACA 1742 

Qy 1792 AAGGT GAC T GAG G CAG C AGT GT CAAAC AT GC CT GAAGGT C T GAC GC C AGAT T T AGT T CAG 18 51 

I I I I I I II I I I I I II III I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I III 
Db 174 3 AAG GT GACT GAGGAAGT C GT G G CAAAC AT GC CT GAAGG C CT GAC T C C AGAT T T AGT AC AG 18 02 

Qy 1852 GAAGCAT GT GAAAGT GAACT GAAT GAAGC CACAGGTACAAAGATT GCTTAT GAAACAAAA 1911 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II I I I I I I II I I I II I I I II 
Db 18 03 GAAGCAT GT GAAAGT GAATT GAAT GAAGTT ACT GGTACAAAGATT GCTTAT GAAACAAAA 18 62 

Qy 1912 GT G GACT T GGT CCAAAC AT C AGAAGC T AT ACAAGAAT CAC T T T AC CC C AC AGC ACAGC T T 1971 

I I I I I I I I M I I I I I I I I I I I I I III I I I I I I I I I I II II I I I I I II I I I I 
Db 1863 AT GGAC T T G GT T CAAAC AT C AGAAGT TAT GCAAGAGT CAC T C TAT CCT GC AGCACAGCT T 1922 

Qy 1972 T GC C CAT C ATT T GAG GAAG CT GAAGC AAC T C C GT CAC C AGT T T T G C CT GAT AT T GT TAT G 2031 

I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I 
Db 1923 T GC C CAT CAT T T GAAGAGT C AGAAG C T ACT C CTT CAC C AGT T T T GC CT GAC AT T GT TAT G 1982 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

Db 19 83 GAAGCACCATT GAATT CTGCAGTTCCTAGTGCT GGT GCTTCCGT GAT ACAGC CCAGCTCA 2 042 

Qy 2092 T C C C CACT GGAAGC AC CT C C T C C AGT T AGT TAT GACAGT ATAAAGC T T GAG CCT GAAAAC 2151 

II III I MM II II II I I I I I I I II I II I I I II I I I II II I I I I I I I 

Db 2 04 3 T CAC CAT T AGAAG CTT CTT C AGT TAATT AT GAAAGC ATAAAAC AT GAGC CT GAAAAC 2099 

Qy 2152 C C C C CAC CAT AT GAAGAAG C CAT GAAT GT AG CACT AAAAG C T T T G G GAAC AAAG GAA 2208 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I I II I I I I I 
Db 2100 C C CC C AC CAT AT GAAGAG GC CAT GAGT GT AT CAC TAAAAAAAGT AT C AGGAAT AAAG GAA 2159 

Qy 2209 G GAAT AAAAGAGC CT GAAAGT T T T AAT G CAG C T GT T CAGGAAAC AGAAG CT C CT TAT AT A 2268 

I III II II I I I II II I I I II I I I I M I I I I I II I II I I I I II I I I II I I I I I I I 

Db 2160 GAAAT T AAAGAGC CT GAAAAT AT T AAT G C AGCT CTT CAAGAAAC AGAAG C T C CT T AT AT A 2219 

Qy 22 69 T C CAT T GC GT GT GAT T T AAT T AAAGAAACAAAG CT CT C CACT GAG C CAAGT C C AGATT T C 232 8 

II I II I I I II I I I I I I I I I I I I II I II II I I I I II II II III II I MINI 

Db 222 0 TCTATTGCATGTGATTTAATTAAAGAAACAAAGCTTTCTGCTGAACCAGCTCCGGATTTC 22 79 

Qy 2329 T CT AAT TAT T CAGAAAT AG C AAAAT T C G AGAAGT C GGT GC C C GAAC AC G CT GAG CT AGT G 238 8 

II I II I II I I I I I I I I II II I I I I I II I Mill M II I I I I I I I I I I 
Db 2280 T CTGATTAT T CAGAAAT GGCAAAAGTT GAACAGCCAGT GC CT GAT CATT CT GAGCTAGTT 2339 

Qy 238 9 GAGGATTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTCCTGAA 244 8 

II I I I I I II I I I 1 I II II II II I I I II I I I I II I I I I I II I I I I II I I II I I I I I 
Db 234 0 GAAGATTCCTCACCTGATTCTGAACCAGTTGACTTATTTAGTGATGATTCAATACCTGAC 23 99 



Qy 2449 GT C C CACAAAC AC AAGAGGAG GCT GT GAT G CT CAT GAAG GAGAGT C T C AC T GA A 2502 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 24 00 GT T C C ACAAAAACAAGAT GAAAC T GT GAT GC T T GT GAAAGAAAGT CT C ACT G AGAC T T C A 24 59 

Qy 2503 GT GT CT GAGACAGT AG C C CAGCACAAAGAG GAG AGACT T AGT G C C T C AC C T C AG GAGCT A 2562 

I I III I III I I II I I I I I II I I I III Ml I 

Db 24 60 TT T GAGT CAAT GAT AGAATAT GAAAAT AAGGAAAAACT C AGT G CT T T G C C AC CT GAGG GA 2519 

Qy 2563 GGAAAGC CAT AT T T AGAGT CT T T T C AG C C CAAT T TAC AT AGT ACAAAAGAT GC TGCA 2619 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 2520 GGAAAG C CAT AT T T G GAAT C T T T T AAG CT C AGT T TAGATAAC ACAAAAGAT AC C CT GT T A 2579 

Qy 2620 T CT AAT G AC AT T C CAAC ATT GAC C AAAAAG GAGAAAAT T T CT T T GCAAAT G GAAGAGT TT 2679 

II I I I I II I II I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I III I 

Db 2580 C C T GAT GAAGT T T CAAC ATT GAG CAAAAAG GAGAAAAT T C CT T T GCAGAT GGAGGAGCT C 2 639 

Qy 2 68 0 AAT ACT GCAAT T TAT T CAAAT GAT GACTTACT T T CT T C TAAG GAAGACAAAAT AAAAGAA 2739 

I I I II I I I I I I I II I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I I 
Db 2640 AGT AC T G CAGT T TAT T CAAAT GAT GAC TT AT T TAT T T CT AAGGAAGCAC AGAT AAGAGAA 2699 

Qy 274 0 AGT GAAAC AT T T T C AG AT T CAT C T C C GAT T GAG AT AAT AG AT GAAT T T C C C AC GT T T GT C 27 99 

I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I II II II II II 
Db 27 00 AC T GAAAC GT T T T C AG AT T CAT C T C CAAT T GAAAT TAT AGAT GAGT T C C C TAC AT T GAT C 2759 

Qy 2 8 00 AGT GCT AAAGAT GAT T C T C CTAAATTAGCCAAGGAGTACACT GAT CT AGAAGTAT C C 2856 

Ml I I I I I I I I I I I I I I I I I II I I I I I III II I I I I I I I I II M I I I I I 
Db 2760 AGTT CTAAAACT GAT T CAT T TT CT AAATT AG C C AGG GAAT AT ACT GAC C T AGAAGTAT C C 2819 

Qy 28 57 GACAAAAGT GAAAT T GCT AAT AT C CAAAGC G GGG CAGAT T CAT TGCCTTGCT T AGAAT T G 2916 

I I I I I I I I I I II I I I I I I I I II Mill I I I I I I I I I I I I I I I I I I I 

Db 2 820 CACAAAAGTGAAATTGCTAATGCCCCGGATGGAGCTGGGTCATTGCCTTGCACAGAATTG 2879 

Qy 2917 CCCT GT GACCTTT CTT T CAAGAAT ATAT AT C CTAAAGAT GAAG TAC AT GT T T C A 2970 

III I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I 

Db 2880 C C CC AT GAC CTTTCTTT GAAGAAC AT ACAAC C CAAAGT T GAAGAGAAAAT C AGT TT CT CA 2939 

Qy 2971 GAT GAAT T CT C C GAAAAT AGGT C CAGT GTAT CT AAGGC AT C CAT AT C G C CT T CAAAT GT C 3030 

I I I I I II I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I 

Db 2 940 GAT GACT T T T C T AAAAAT G G GT C T G CT ACAT C AAAGGT GCT C TT AT T GC C T C CAGAT GT T 2999 

Qy 3031 TCTGCTTTG GAAC CT C AGACAGAAAT GGG C AG CAT AGT T AAAT C CAAAT C ACT TAC GAAA 3090 

I I I I I I I I I I MM I I II II I II I I I I I I I I I I II I II III MM 
Db 3000 TCTGCTTT G GC CAC T CAAGCAGAGAT AGAGAGC AT AGT TAAAC C CAAAGT T C T T GT GAAA 3059 

Qy 3091 GAAGCAGAGAAAAAACTT C CTT CT GACACAGAGAAAGAGGACAGAT CCCT GT CAGCTGT A 3150 

I I I I I I I I I I I I II I I I M I I I II Mill II I II M M II I I I I II III II 
Db 3060 GAAGCT GAGAAAAAACTT CCTT C CGAT ACAGAAAAAGAGGACAGAT CAC CAT CT GCTAT A 3119 

Qy 3151 T T GT CAGCAGAGCT GAGT AAAACT T C AGT TGT T GAC C T CCT CT ACT G GAGAGAC AT TAAG 3210 

M I I I I I I I I I I I I II I II II I II I I II I M II I I I I I I I I II II I I I II II II I II I 
Db 3120 T T TT CAGCAGAGCT GAGT AAAACT T C AGT T GT T GAC C T C CT GT AC T G GAGAGAC AT TAAG 3179 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 32 7 0 

M M I I I I I II I I II I I II I I II I I II I II II II I I I I II M I I II I II I M I I I 
Db 318 0 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3239 



Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

M I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I II 
Db 3240 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3299 

Qy 3331 AT AT AT AAGG G C GT GAT C C AG GCT AT C C AGAAAT C AGAT GAAG GC C AC C C ATT C AG GGCA 3390 

I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I I I 
Db 3300 AT AT ACAAG G GT GT GAT C CAAG C TAT C C AGAAAT CAGAT GAAGG C C AC C C ATT C AG GG CA 3359 

Qy 3391 TATT T AGAAT CT GAAGT T GCTATAT CAGAGGAAT T GGTT CAGAAATACAGTAATT CT GCT 3450 

Ml I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I II I 
Db 3360 TAT C T GGAAT C T GAAGT T GC T AT AT C T GAGGAGT T GGT T C AGAAGT AC AGTAAT T CT GCT 3419 

Qy 3451 CT T GGT CAT GT GAAC AG CACAAT AAAAGAACT GAGGC GG CT T T T C T T AGT T GAT GAT T T A 3510 

I I I I M I I I I I I I I I I I I I I I I I I Mill I I I I I II I I I I I I I I I I I I I I I I I I 

Db 3420 CT T GGT CAT GT GAAC T G C AC GATAAAG GAAC T CAG GC GC CT C T T CT T AGT T GAT GAT T T A 3479 

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 357 0 

II I I M I I I I I I I I I I I I I I I I II I I I I I I I M I I I I I I I I I I I I I I II I I I I I I I 

Db 3480 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3539

Qy 3571 AAT GGT CT GACACT ACT GAT TT TAG C T CT GAT CT CACT CT T C AGT AT T C CT GT TAT T TAT 3630 

M I I I I I II I I M I II I I I I I I I I I I I I II I M II I I I I I I I I I I I M I I I I I I I I 
Db 3540 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3599 

Qy 3631 GAACGGCATCAGGTGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGAT 3690 

I M I I I I I I I II I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I I I I I III 

Db 3600 GAACG G CAT C AG GC ACAGAT AGAT CAT TAT CT AGGACT T GCAAATAAGAAT GT T AAAGAT 3659 

Qy 3691 G CCAT GGC CAAAAT C CAAG CAAAAAT C CCT G GAT T GAAG C GCAAAGC AGA 3740 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I II 

Db 3660 GCT AT GGC T AAAAT C C AAGCAAAAAT C C C T GGAT T GAAGC GCAAAGC T GA 3709 
Query Match 62.6%; Score 2343.6; DB 9; Length 4053;

Best Local Similarity 81.3%; Pred. No. 0; 

Matches 3017; Conservative 0; Mismatches 574; Indels 119; Gaps 21 

Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193 

I M IN I I Ml I I I I I I I I I I I I I I I I I II I I I II I I I I I I I I I I I II 
Db 16 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 75 

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252 

I I I I I I II I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I || 
Db 76 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 134 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I M I II I II I I I I I I I I I I I I I II I I I II I I I I I I I I II II I I I I I I II 
Db 135 AT GGAAGAC C T GGAC CAGT CTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 191 

Qy 313 CCGCCCGCCTT CAAGT AC CAGT T C GT GAC G GAG C C C GAG GAC GAG GAGGAC GAG GAG GAG 372 

I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I II II III 
Db 192 CAGC C C G C GT T CAAGT AC C AGTT C GT GAGGGAGC C C GAGGAC GAGGAG GAAGAAGAG 24 8 

Qy 373 GAGGAGGAC GAGGAG GAGGAC GAC GAGGAC C T AGAG GAAC T G GAG GT GCT G GAGAGGAAG 432 

I I I M I I I I I I I I I I I II I I I I I Mill I II I I II II I I I I I I I I II I I II I I I 
Db 2 4 9 GAGGAG GAAGAG GAG GAC GAGGAC GAAGACCT GGAGGAGCT GGAGGT GCT GGAGAGGAAG 30 8 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

I I I I I M I I I I I I I I I I II I I I I I I I I I I I I 1 I I I I I I I I || I I I 

Db 309 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 368 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546

I I I I M I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 369 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 428 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

I I I I I I I I I I I I I M I M I I I I I I I I I I I I I I I I I I I I 

Db 429 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 488

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I I I M I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I M I 
Db 489 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 548

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I I I I I I I I II I II I I I I I II II I II III I I II I I I I III Ml 
Db 549 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 608 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I M M I M I I II I I I II I I I II I I I II I I II I I II I 
Db 609 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 668 



Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I t I II I I I I I I I I | | | I | | | I I II I I I I I I I I I I II I I I I I I I I II I I || | | | | | | | 
Db 669 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 728 

Qy 808 GTGATACCCTCCTCTGCAGAAAAAATTATGGATTTGATGGAGCAGCCAGGTAACACTGTT 867

M II I I I I I I II I II I I I I I I I MINI I I I I I I I I I I I I I I I I I I II I I I II 
Db 729 GTGATACGCTCCTCTGCAGAAAA TAT G GACT T GAAG GAG C AGC C AGGT AAC ACT AT T 785 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

Ml I I I I I I I I II I I I I II I I I I I I I I II II 'I I II I I I I I I I I || I I I I I || | | | | | | 
Db 786 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 845 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987

I I I I I I I I I I I I I I I M I I I I I I I I I I I II I I I I I II I I I I I I II III || 
Db 846 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 905

Qy 988 GT GT CATC CT CAGAAGGAACAATT GAAGAAACTTTAAAT GAAGCTT CTAAAGAGTT GCCA 1047 

M I I M I I I I I I M I I II I I I I I I I I I I I I I I I I I I I I I I I I | | It 
Db 906 GT AT T AC C C ACT GAAGGAAC AC T T C AAGAAAATGT CAGT GAAG CT T CT AAAGAGGT CT CA 965 

Qy 104 8 GAGAGGGCAACAAAT C CATTTGTAAATAGAGATTTAGCAGAATTTT CAGAATTAGAATAT 1107 

I I I I I I I I I M M I I M I I II I I I II I I I I I I I I I II I II I I I | | | | | | 
Db 966 GAGAAG GCAAAAACT CT ACT C AT AGAT AGAGAT T TAAC AGAGT T T T C AGAAT T AGAAT AC 1025 

Qy 110 8 T C AGAAAT GG GAT CAT CTT T T AAAG GCT C C C CAAAAGGAGAGT C AG C CAT AT T AGT AGAA 1167 

M I I I I II I I II I I I I I II I I Ml I I I I I I I Ml || | M II I I I I II I 
Db 102 6 T CAGAAAT G GGAT CAT C GTT CAGT GT C T CT C CAAAAGC AGAAT CT G CC GTAAT AGT AG CA 1085 

Qy 1168 AAC AC TAAG GAAGAAGT AAT T GT GAG G AGT AAA GAC AAAGAGGAT T T AGT T T GT AGT 1224 

M I I I II I I I I I I I I I I I I I I I I I I M I I I I I I I I I | I I I II I 
Db 108 6 AAT C CT AG GGAAGAAAT AAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGTAAT 1145 

Qy 1225 G C AG C C CT T C AC AGT C CACAAGAAT C AC CT GT G G GT AAAGAAGAC 1269 

I I I I I I I I I I I I I I I I I II I I I I II I I I I I I 

Db 1146 AACAT C C T T CAT AAT CAACAAGAGT T AC C T AC AGCT CT T AC T AAAT T G GT TAAAGAGGAT 1205 

Qy 1270 AGAGTT GT GT CT CCAGAAAAGACAAT GGACATTTTTAAT GAAAT GCAGAT GT CAGTAGTA 1329 

I I I I I I I I I I II I I I II I I I I I I I I I I I I II I I I I I I I II I I I 

Db 120 6 GAAGTT GT GT CTT CAGAAAAAGCAAAAGACAGTTTTAAT GAAAAGAGAGTT GCAGT GGAA 12 65 

Qy 1330 G C AC CT GT GAG GGAAGAGT AT GC AGAC T T TAAG C CAT T T GAAC AAG CAT G GGAAGT GAAA 1389 

I Ml I I I I I I | || I I I I I I I I I | | || I I I M I M I II I I I I I I II I I I I I 
Db 1266 GCT C CT AT GAGGGAGGAAT AT G C AGACT T CAAAC CAT T T GAGC GAGT AT GG GAAGT GAAA 1325 

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437 

I I I I I I I I I I I I I I I I I I I I I I II I I II I || || 

Db 132 6 GATA GT AAG GAAGAT AGT GAT AT GT TGGCTGCTG GAGGT AAAAT C GAG AG C AACT T G 1382 

Qy 1438 GAAAGTAAAGT GGACAGAAAAT GCTT GGAAGATAGCCT GGAGCAAAAAAGT CTT GGGAAG 14 97 

I I II I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I Ml I II 
Db 1383 GAAAGTAAAGT GGAT AAAAAAT GT T T T G C AGAT AG C CT T GAGC AAACT AAT C AC GAAAAA 1442 

Qy 1498 GAT AGT GAAGG CAGAAAT GAGGAT GCTTCTTTCCC CAGT AC C C C AGAAC C T GT GAAG GAC 1557 

I I I II I I I I II I I I I I III I I I I I I I I II I I I I I I I I II I I I I Mill 
Db 1443 GAT AGT GAGAGT AGTAAT GAT GATACTT CTT T C CC CAGT ACGCCAGAAGGT ATAAAGGAT 1502 



Qy 1558 AGCT C CAGAGCAT AT AT TAC CT GT GCT T CCTT T A C CT C AG C AAC C GAAAGC AC C ACA 1614 

IN I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1503 C GT T C AG GAGC AT AT AT C AC AT GTGCTCCCTT T AAC C CAG C AG CAACT GAGAGC AT T GC A 1562 

Qy 1615 GCAAACACTTT C C CT T T GTTAGAAGAT CAT ACT TCAGAAAAT AAAAC AGAT GAAAAAAAA 1674 

I I I I I I I I I I I I II I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I II I I I 
Db 1563 AC AAAC AT TTTTCCTTTGT TAG GAG AT C C TAC T T C AGAAAAT AAG AC C GAT GAAAAAAAA 1622 

Qy 1675 ATAGAAGAAAGGAAGGC C CAAAT TAT AACAGAGAAG AC T AGC C C C AAAAC GT C AAAT 17 31 

I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I | | | | 

Db 1623 AT AGAAGAAAAGAAGG C C CAAAT AGT AAC AGAGAAGAAT AC TAG C AC CAAAACAT CAAAC 1682 

Qy 1732 CCTTTCCTT GT AGC AGT AC AGGAT T CT GAGGC AGAT TAT GTT AC AAC AGAT AC CT TAT C A 1791 

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I Ml || 
Db 1683 C CTTT T CTT GTAGCAGCACAGGATT CT GAGACAGATTAT GTCACAAC AGATAATTTAACA 1742 

Qy 17 92 AAGGT GACT GAG G C AGC AGT GT CAAAC AT G C C T GAAGGT C T GAC G CC AGATT T AGTT CAG 1851 

I I I I I I I I I I I II I I Ml I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I IN 
Db 1743 AAGGTGACTGAGGAAGTCGTGGCAAACATGCCTGAAGGCCTGACTCCAGATTTAGTACAG 1802 

Qy 1852 GAAGC AT GT GAAAGT GAACT GAAT GAAGC C AC AGGTACAAAGAT T GC T TAT GAAACAAAA 1911 

I I II II I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II 

Db 1803 GAAGCAT GT GAAAGT GAAT T GAAT GAAGT T AC T GGT ACAAAGAT T GCT TAT GAAACAAAA 1862 

Qy 1912 GT G GACT T GGT C CAAACAT C AGAAG CT ATACAAGAAT C AC T T TAC C C C AC AGCAC AG CTT 1971 

I I I I I I I I I I I I I I II I II I I I I III I I I I I I I I I I II II I I I I I I I I I I I 
Db 18 63 AT GGACTT GGTT CAAACAT CAGAAGTTAT GCAAGAGT CACT CT AT C CT GCAGCACAGCTT 1922 

Qy 1972 T G C C CAT CAT TT GAGGAAG C T GAAGCAACT C C GT CAC C AGT T T T GC CT GAT AT T GT TAT G 2031 

I I I M I I I I I I I I I II I II I I I I I I I I I I || | | | | | | | | | | || | | | | | | | | || 
Db 1923 T G C C CAT C AT TT GAAGAGT C AGAAGCT AC T CCT T CAC C AGT T T T G C CT GAC AT T GT T AT G 1982 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 1983 GAAGC AC CAT T GAAT T CT G C AGT T C CT AGT GCT GGTGCTTCCGT GAT AC AG C C C AGCT C A 2042 

Qy 2092 T C C C CACT GGAAG C ACC T C CT C C AGT T AGT TAT GACAGT ATAAAG CT T GAGC CT GAAAAC 2151 

M Ml I I I I I M II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I M 
Db 2043 T CAC CAT T AGAAG CT T CT T C AGT TAATT AT GAAAGCAT AAAAC AT GAGC CT GAAAAC 2099 

Qy 2152 CC C C CAC CAT AT GAAGAAGCC ATGAAT GTAGCACT AAAAGCTTTGGGAACAAAGGAA 2208 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 
Db 2100 CCCCCAC CAT AT GAAGAGGCCAT GAGT GT AT CACTAAAAAAAGTAT CAGGAATAAAGGAA 2159 

Qy 22 09 GGAATAAAAGAG CCT GAAAGT T T T AAT GCAGCT GT T CAGGAAAC AGAAG CT C CT T AT AT A 2268 

I Ml I I I I I I M II I M I II I I I I I I II I I I I I I I I II I I I I I I I II I I I I M I 

Db 2160 GAAAT T AAAGAG CCT GAAAAT AT TAAT G C AG CT C T T CAAGAAACAGAAGCT C CTT AT AT A 2219 

Qy 2269 T C CAT T GC GT GT GAT T TAAT T AAAGAAACAAAGCT CT CC ACT GAG C CAAGT C C AGATT T C 2328 

II Mill I I I II II I II II I II I I I II I II II I M MM III III MINI 

Db 2220 T CT AT T GCAT GT GAT T TAAT T AAAGAAACAAAGCT T T CT GC T GAAC CAGCT C CGGAT T T C 2279 

Qy 2329 T C TAAT TAT T C AGAAAT AGC AAAAT T C GAGAAGT C GGT G C C C GAAC AC GCT GAGCT AGT G 2388 

II I I I M I I II I I II I I II I I I I II II I I II I I I I II II II I I II I I 
Db 2280 T C T GAT TAT T C AGAAAT G G CAAAAGT T GAAC AGC C AGT GC C T GAT CAT T CT GAGC T AGT T 2339 

Qy 2389 GAGGATT CCT CACCT GAAT CT GAAC CAGTTGACTTATTTAGT GAT GATTC GAT TCCTGAA 244 8 



Db 234 0 GAAGAT T C CT C AC C T GAT T C T GAAC CAGT T GACT TATT T AGT GAT GAT T CAAT AC CT GAC 2399 

Qy 2449 GT C C C AC AAAC AC AAGAGGAGGCT GT GAT G C T CAT GAAGGAGAGT C T C AC T GA A 2502 

M I I I I I I I I I I I I I II I I I I I I I I I I I I || I | | | | | | | | | | | | | 
Db 2400 GT T C C ACAAAAAC AAGAT GAAACT GT GAT G C T T GT GAAAGAAAGT C T C AC T GAGACT T C A 2459 

Qy 2503 GT GT CT GAGACAGTAGC C CAGCACAAAGAGGAGAGACT TAGTGCCT CAC CT CAGGAGCT A 2562 

I I Ml I Ml I I I I I 1 I I I I I | | | Ml Ml | 

Db 2 4 60 T T T GAGT CAAT GAT AGAAT AT GAAAATAAGGAAAAACT CAGT GC T T T G C CAC C T GAG G GA 2519 

Qy 2563 GGAAAG C CAT AT T TAG AGT CT T T T C AG C C CAAT T T AC AT AGT ACAAAAG AT G C TGCA 2619 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 2520 G GAAAG C CAT AT T T GGAAT CTT T TAAGC T C AGT T TAG AT AAC ACAAAAG AT AC C CT GT T A 257 9 

Qy 2620 T CT AAT GAC AT T C C AAC AT T GAC C AAAAAG GAGAAAAT T T C TT T GC AAAT G GAAGAGT T T 2 67 9 

M MM II I I II II I II I II II II II II II I I I II I I M I Mill III I 
Db 2580 C C T GAT GAAGT T T CAAC AT T GAG CAAAAAG GAGAAAAT T C CT TT GC AGAT G GAGGAG CT C 2639 

Qy 2 68 0 AAT AC T GCAAT T TAT T CAAAT GAT GACT TACT TT C T T CT AAGGAAGACAAAATAAAAGAA 2 739 

I I I M I I I M II I I I II I I I I II II I I I II I M II II I II I I I I I I I I I I 
Db 264 0 AGT ACT G CAGT T TAT T CAAAT GAT GACT TAT T TAT T T CTAAGGAAGC AC AGAT AAGAGAA 2699 

Qy 274 0 AGT GAAAC AT T T T C AGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT T T GT C 2799 

I I M II I II I II I I I II I I I I I I I I I I | | || I I I I I I I I || || || || || 
Db 2 7 00 ACT GAAAC GT T T T CAG ATT CAT C T C CAAT T GAAAT TAT AGAT GAGT T C C CT AC AT T GAT C 2 759 

Qy 2800 AGT G CT AAAGAT GAT T C T C C T AAAT T AGC CAAGGAGT ACACT GAT CT AGAAGT AT C C 2856 

Ml Mill I I I I II I I I I II I II I M I Ml || | | | | | I I I I I I I II I I I 
Db 2760 AGTT CT AAAAC T GAT T CAT T T T CT AAAT TAG C CAGG GAAT ATACT GAC CT AGAAGT AT C C 2819 

Qy 2857 GACAAAAGT GAAAT T G C T AAT AT C CAAAG C G G GGCAGAT T CAT TGCCTTGCT T AGAAT T G 2916 

M I M I I M M II I II I I II M I I I I I I I I I M I I II II I I I I I I I 

Db 2 820 C ACAAAAGT GAAAT T GCT AAT G C C C C G GAT GGAGCT GGGT CAT T G C C T T GC AC AGAAT T G 2 879 

Qy 2917 C C CT GT GAC CTTTCTTT CAAGAAT AT AT AT C C T AAAGAT GAAG T AC AT GT T T C A 2970 

M I I II II I M II I I I I I I I I I I I I I I I I I | | | | | | || IN 

Db 2880 C C C CAT GAC CTTTCTTT G AAGAAC AT AC AAC C C AAAGT T GAAGAGAAAAT CAGT T T CT C A 2 939 

Qy 2 971 GAT GAAT T C T C C GAAAAT AG GT C CAGT GT AT CTAAGGC AT C CAT AT C G CCT T CAAATGT C 3030 

I M M M M II II I II I I I II I II I I I I II I I I I II I I I I 

Db 2 94 0 GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 2 999 

Qy 3031 TCTGCTTTG GAAC CT C AGAC AGAAAT GGGC AGC AT AGT TAAAT C CAAAT C ACT T AC GAAA 3090 

M II I II II I I I II I I I I I I I I I I II I I I I I I I I I I I I Ml I I I I 
Db 3000 TCTGCTTT GGC CAC T CAAGCAGAGAT AGAGAG CAT AGT T AAAC C CAAAGT T C T T GT GAAA 3059 

Qy 3091 GAAG CAGAGAAAAAAC T T C CT T CT GACAC AGAGAAAGAGGACAGAT C C CT GT C AGC T GT A 3150 

M I M II II II II II I I II I I I M II II I II I I II M II I I II I II III II 
Db 3060 GAAGC T GAGAAAAAAC TTCCTTCC GAT AC AGAAAAAGAGGACAG AT CAC CAT C T GC T AT A 3119 

Qy 3151 T T GT CAG CAGAG C T GAGTAAAACT T CAGT T GT T GAC CT C C T C TACT G GAGAGACAT T AAG 3210 

M M I I M I M II II II I I II I II II I I I I I II I I II I M M I | || | | | | || || || || 
Db 3120 T T T T CAG CAGAG C T GAGTAAAACT T CAGT T GT T GACCT C CT GT ACT GGAGAGACAT T AAG 3179 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270 

I I M II II I II I I I I I I I I | | | | | | I I I II I || || | || || || | | | | | | | | | | I | | 



Db 3180 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3239



Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

I I I I I II I I I I I I I I I II I I I I I M I I I I II I I I I II I Mill I I I I I I I I I I I I 
Db 3240 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3299 

Qy 3331 AT AT AT AAGGGC GT GAT C C AG GCT AT C C AGAAAT C AGAT GAAG GC C AC C CAT T C AG G GC A 3390 

I I I I I I I I I I I I I I I I I I I I I II I I I I M I M I I I I I I I I I I I I I I I I I I I || | | | | 
Db 3300 AT AT ACAAGG GT GT GAT C CAAG C TAT C C AGAAAT CAGAT GAAGGC C AC C CAT T C AGG GC A 3359 

Qy 3391 TAT T T AGAAT C T GAAGT T GC T AT AT C AGAGGAAT T GGT T CAGAAAT AC AGTAAT T CT G CT 3450 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I M II I I I I I I I I I I I II I I I 
Db 3360 TATCTGGAATCTGAAGTTGCTATATCTGAGGAGTTGGTTCAGAAGTACAGTAATTCTGCT 3419 

Qy 3451 CT T GGT CAT GT GAACAGC ACAATAAAAGAACT GAGGC G GCTT T T C T T AGT T GAT GAT T T A 3510 

I I I I I I I I I I I I I I I I I I I I I I I I II I I I II I I I II I II I I I I II I I I I I M II 
Db 342 0 CT T G GT CAT GT GAAC T G CAC GAT AAAG GAACT CAGG C GC CT C T T CTT AGT T GAT GAT T T A 347 9 

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570 

I I I I I I I I I I I I I I M I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 3480 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3539 

Qy 3571 AAT G GT CT GAC AC TACT GAT T T T AGCT CT GAT C T C ACT CTT C AGT AT T C C T GT TAT T TAT 3630 

I M I I I ! I M I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 3540 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3599 

Qy 3631 GAACGGCAT CAGGT G CAGAT AGAT CATTAT CTAGGACTT GCAAACAAGAGT GTTAAGGAT 3690 

M I I I I I I M I I I I I I I I I I I I II I I I I I I I I II I I I II I I I I I I I I I I I I I III 
Db 3600 GAACGGCATCAGGCACAGATAGATCATTATCTAGGACTTGCAAATAAGAATGTTAAAGAT 3659 

Qy 3691 G C CAT G G C CAAAAT C CAAGC AAAAAT C C C T GGAT T GAAG C GCAAAGC AGA 3740 

II M I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II 

Db 3660 GCTATGGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3709
Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193

I II Ml I I III M I I I I I I I I I I II I I I I I I I II I I I I I I I I II M I I 
Db 16 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCAACCCCCACAACCGCCCGCGGCT 75 

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252

I I I I I I M I I I I I I I I I I I I I I I II I I I I I I I I I I I I I II I I I I 
Db 76 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 134 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312

I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I M I I I II I I I I I I I | | | 
Db 135 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 191 

Qy 313 CCGCCCGCCTT CAAGT AC C AGT T C GT GAC GGAGC C C GAGGAC GAGGAGGAC GAGGAGGAG 372 

I I I I I I I I I I I M I I I I I I II I I I I I I I I I I I I I I I I I II I I I I I II II Ml 
Db 192 CAGCCCGCGTTCAAGTACCAGTTCGTGAGGGAGCCCGAGGACGAGGAG GAAGAAGAG 24 8 

Qy 373 GAGGAGGAC GAGGAG GAG GAC GAC GAGGAC C T AGAGGAAC T GGAGGT GCT G GAGAGGAAG 432 

I I I I I I I I I I I I I I M I I Mill I I I I I I I I I I II I I I I I I I I || M || || | | | 
Db 249 GAGGAGGAAGAGGAGGACGAGGACGAAGACCTGGAGGAGCTGGAGGTGCTGGAGAGGAAG 308 

Qy 4 33 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

M II I I I I I I I I II I I I I I I I I I I I I | | | | | | || | | | | | || | | | | 

Db 309 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 368 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546 

I M I I I I I I I I I I I II II I I I I I || | || Ml 

Db 369 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 428

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

M I I I I I I I M I I I I I I I I I I I I I I I I I I I I | | | | | M 

Db 429 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 488

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I M M II I I I II I I II II I I I I II II II I | | || || | || | | | I I I | | | | II I I I 
Db 489 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 548

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I M I M I I I I II II I I I I I II I I I I I M I MINIM III III 
Db 549 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 608 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I M M M I I I I I I I I I II I I II I I I I I I I I I II I I I 
Db 609 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 668 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I M M II I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I II II I I I M I I 

Db 669 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 728 



Qy 808 GTGATACCCTCCTCTGCAGAAAAAATTATGGATTTGATGGAGCAGCCAGGTAACACTGTT 867

M I M I I I M II I I I I I I I I 

Db 729 GTGATACGCTCCTCTGCAGAAAA TAT G GAC T T GAAG GAG CAGC C AG GTAAC ACT AT T 785 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927

HI I Mill II I I I II I I | || 1 

Db 786 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 845 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

II I II I I I I I I I I I I I 1,111 I II I I I I I II I I I I I I I I I I I I II Ml || 
Db 846 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 905 

Qy 98 8 GT GT CAT C CT C AGAAG GAAC AAT T GAAGAAACT T T AAAT GAAGCT T CT AAAGAGT T GC C A 1047 

M I I M I I I I I I I I I I II I II I I I I I I I I I I II I I I I I I I I I | | || 
Db 906 GTATTACCCACTGAAGGAACACTTCAAGAAAATGTCAGTGAAGCTTCTAAAGAGGTCTCA 965 

Qy 104 8 GAGAGGGCAACAAAT C CATTT GTAAAT AGAGATTTAGCAGAAT TTT CAGAATTAGAATAT 1107 

I I I I I I I I I M M I I M I I I I I II I I I I I I I I I I I I I I I I I I I I I M I I 
Db 966 GAGAAGGCAAAAAC T CT ACT C AT AGAT AGAGAT TTAAC AGAGT TTT C AGAATT AGAAT AC 1025 

Qy 1108 T C AGAAAT GGGAT CAT C T T T T AAAGG C T C C C CAAAAG GAGAGT CAGC CAT AT T AGT AGAA 1167 

I I M I I I I I I II I I I I I || I I I M I II I I I I III II III I I I I I I M I 
Db 1026 T CAGAAAT GGGAT CAT C GT T CAGT GT CT C T C CAAAAG CAGAAT CT GC C GT AAT AGT AGC A 108 5 

Qy 1168 AACACTAAGGAAGAAGTAATTGTGAGGAGTAAA GACAAAG AG GAT T T AGTT T GT AGT 1224 

M I I I I I I I I I I I II I I I I I I I I I I M I I I I I I I I I I II I I I I 
Db 10 8 6 AAT C CT AG GGAAGAAAT AAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGT AAT 114 5 

Qy 1225 G CAGC C CT T C AC AGT C C ACAAGAAT C AC CT GTGGGTAAAGAAGAC 1269 

I I I M I I I I I II I I I I I I II I II I I I I I I II 

Db 114 6 AACAT C C T T CAT AAT CAACAAGAGT T AC C T AC AGC T CT T AC TAAAT T G GT T AAAGAGGAT 1205 

Qy 12 7 0 AGAGTTGTGTCT CCAGAAAAGACAAT GGACATTTTTAATGAAATGCAGATGTCAGTAGTA 132 9 

I I I I I I I I I I I I I I I I I Ml I I I I I I I I I I I I I I I I I I I I I I I 

Db 1206 GAAGT T GT GT C T T CAGAAAAAGCAAAAGAC AGT T T TAAT GAAAAGAGAGT T GCAGT GGAA 1265 

Qy 1330 GCACCTGTGAGGGAAGAGTATGCAGACTTTAAGCCATTTGAACAAGCATGGGAAGTGAAA 1389 

M I I I I I I I I I I II I I I I I I I I I I I II II I I I I I I I || | | | | | || | | | | M 
Db 1266 GC T C CTAT GAGG GAGGAAT AT GC AGAC T T CAAAC CAT T T GAGC GAGT AT GG GAAGT GAAA 132 5 

Qy 1390 GAT AC T TAT GAG G GAAGT AGGGAT GTGCTGGCTGC TAGAGCT AATGTG 1437 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II 

Db 1326 GATA GTAAG GAAGAT AGT GAT AT GT TGGCTGCT GG AGGTAAAAT CGAGAGCAACT T G 1382 

Qy 1438 GAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAAGTCTTGGGAAG 1497

I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I II III I II 
Db 1383 GAAAGTAAAGTGGATAAAAAATGTTTTGCAGATAGCCTTGAGCAAACTAATCACGAAAAA 1442 

Qy 1498 GATAGTGAAGGCAGAAATGAGGATGCTTCTTTCCCCAGTACCCCAGAACCTGTGAAGGAC 1557 

M I I I I I I I II I I I I I Ml | | | | | || M | | | | | | | | | | | | | | | | | | | | 
Db 14 43 GAT AGT GAGAGT AGTAAT GAT GAT AC TTCTTTCCC CAGT AC GC CAGAAGGT AT AAAGGAT 1502 

Qy 1558 AG CT C CAGAG CAT AT AT T AC CT GT GCTTCCTT T A CCTCAGCAACCGAAAGCACCACA 1614 

I I I I I I I I II II I I I I I I I I I I I II I I I I I I I I II I I I I I I II 
Db 1503 C GTT C AGGAGC AT AT AT C AC AT GTGCTCCCTT T AAC C C AG CAG CAAC T GAGAG CAT T G C A 1562 



Qy 1615 GCAAAC ACT T T C C CT T T GT T AGAAGAT C AT ACTT CAGAAAAT AAAAC AGAT GAAAAAAAA 1674 

M I I I I I II I I I I I I I I I I I I I I I I I I li II I I I I I I I I II I I I I I I I I I I I I 
Db 1563 ACAAAC ATT T T T C CT T T GT T AGGAGAT C CT ACT T CAGAAAAT AAGAC C GAT GAAAAAAAA 1622 

Qy 1675 AT AGAAGAAAG GAAGGC C CAAAT T AT AAC AGAGAAG AC TAG C C C C AAAAC GT C AAAT 1731 

I I I I I I I I I I I I I I I II I I I I I II I I II I I I I I I I I I I I I I I II I I I I I I I 
Db 1623 AT AGAAGAAAAGAAGGCC CAAAT AGTAACAGAGAAGAAT ACT AGCACCAAAACAT CAAAC 1682 

Qy 1732 CCTTTCCTT GT AGCAGT AC AG GAT T CT GAGG CAGAT TAT GT T ACAAC AG AT AC C T TAT C A 1791 

I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I III II 
Db 1683 CCTTTTCTT GT AG CAG C ACAGGAT T CT GAGAC AGAT TAT GT C AC AAC AGATAAT T TAAC A 1742 

Qy 1792 AAG GT GAC T GAG G CAGC AGT GT CAAAC AT G C C T GAAGGT C T GAC GC CAGAT T TAGT T CAG 1851 

I I I I I I I M I I I I II III II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I M 
Db 1743 AAGGT GACT GAG GAAGT C GT G GCAAAC AT GC CT GAAG GC CT GACT CC AGAT T TAGT AC AG 1802 

Qy 1852 GAAGC AT GT GAAAGT GAACT GAAT GAAG C C AC AGGTACAAAGAT T GCTT AT GAAACAAAA 1911 

I M I I I II I I I I I I I I II I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 18 03 GAAGCAT GT GAAAGT GAATT GAAT GAAGTTACT GGTACAAAGATT GCTT AT GAAACAAAA 1862 

Qy 1912 GT GGACT T GGT CCAAACATCAGAAGCT ATACAAGAATCACTTTACCC C ACAGCACAGCT T 1971 

I I I M I I I I I I I I I II I I I I I II I II I I I I I I I I I I II II M I I I I I I I I I 
Db 1863 AT G GACT T G GT T CAAACAT C AGAAGT T AT G CAAGAGT CACT CT AT CCT G CAG CAC AGCT T 1922 

Qy 1972 T GCC CAT CAT T T GAGGAAGCT GAAG CAAC T C C GT CAC C AGTT T T GC C T GAT ATT GT T AT G 2 031 

I M I II I I I I I I I I II I I I II I I I I I I I I I I! I I I I I I I I I I I I I I I I I I I I I 
Db 1923 T G C C CAT CAT T T GAAGAGT C AGAAG CT ACT C C T T CACCAGT T T T GC CT GAC AT T GT TAT G 19 82 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1983 GAAGCACCATT GAAT TCTGCAGTT CCT AGT GCT GGT GCTT CCGT GAT ACAGCCCAGCTCA 2 042 

Qy 2092 TCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCTGAAAAC 2151 

M Ml I I I I I I I I I I I I I I I I I I II I II I I II I I I I I I I II II I I I I 
Db 2043 T CAC CAT TAG AAG CT T CTT C AGT T AAT TAT GAAAGCAT AAAAC AT GAGC C T GAAAAC 2 099 

Qy 2152 C C C C CAC CAT AT GAAGAAGC CAT GAAT GT AG CACT AAAAGCTTTGGGAACAAAGGAA 2208 

I I I I I I M I I I I I I I I I I I I I I II MM MM I II II I I I II II II I I I 
Db 2100 C C CC CAC CAT AT GAAGAGGC C AT GAGT GT AT CACT AAAAAAAGT AT C AGGAATAAAGGAA 2159 

Qy 2209 G GAATAAAAGAGC CT GAAAGTT T T AAT GC AGCT GT T CAGGAAAC AGAAG C T C CT TAT AT A 2268 

I M I I M II I I I II I I I I II II II I II I I MM II II II II I I I II I II II I I I 

Db 2160 GAAAT T AAAGAGC CT GAAAAT AT TAAT G C AGC T CTT CAAGAAAC AGAAGCT CCT TAT AT A 2219 

Qy 22 69 T C CAT T GC GT GT GAT T TAAT TAAAGAAAC AAAGCT CT C CACT GAG C CAAGT C CAGATT T C 2328 

II I I I M M M I I II I II II II I M II II II II II II I I III III II I I I I 

Db 2220 T C TAT T GC AT GT GAT T TAAT T AAAGAAACAAAG CTTTCTGCT GAAC C AGC T C C GGATT T C 2279 

Qy 232 9 T C TAAT TAT TCAGAAATAGCAAAATTCGAGAAGTC GGT GCCC GAAC AC GCT GAGCTAGTG 23 88 

I M II I I I I II I II II M II II I I I II I I I II I II II I I II I II I II 
Db 2280 TCTGATTATTCAGAAATGGCAAAAGTTGAACAGCCAGTGCCTGATCATTCTGAGCTAGTT 2339 

Qy 23 8 9 GAGGAT T C C T CAC C T GAAT CT GAAC C AGT T GACT TAT T TAGT GAT GAT T C GAT T C CTGAA 2448 

M I M II II I I II I II I II I II II M I I II M II II II II II I II I II II Mill 
Db 234 0 GAAGAT T C CT CAC CT GAT T C T GAAC C AGT T GACT TAT T TAGT GAT GAT T C AAT AC CT GAC 2399 

Qy 2449 GTCCCACAAACACAAGAGGAGGCTGT GAT GCT CAT GAAGGAGAGTCT CACT GA A 2502 



Db 2400 GT T C CACAAAAAC AAGAT GAAACT GT GAT GC TT GT GAAAGAAAGT C T C ACT GAGAC TT C A 2459 

Qy 2503 GT GT CT GAGAC AGT AG C C CAGC ACAAAGAGGAGAGAC T T AGT GC CT C AC CT C AG GAGCT A 2562 

I I I I I I I I I I I I I I I I I I I I I I I III IM I 

Db 2460 T T T GAGT C AAT GAT AGAAT AT GAAAATAAGGAAAAACT C AGT G C T T T G C C AC CT GAGGGA 2519 

Qy 2563 G GAAAGC CAT AT T T AGAGT CT T T T C AG C C CAAT TT AC AT AGT ACAAAAGAT G C TGCA 2619 

I I M II I I II I I I I I I I I I I I | | | | | | || | | | | | | | || | | | | | | | 
Db 2520 G GAAAGC CAT AT T T GGAAT CT T T T AAGC T C AGT TT AGAT AAC AC AAAAGAT AC C C T GT T A 2579 

Qy 2620 T CT AAT GACAT T C C AAC AT T GAC C AAAAAG GAGAAAAT T T C T T T G C AAAT G GAAGAGT T T 2 67 9 

I I M I I I I I M I I I I I I I I I II I I I I I I I M I I I I I I I I I Mill II I I 
Db 2580 C CT GAT GAAGTT T CAACAT T GAGCAAAAAG GAGAAAATT C C T T T G CAG AT G GAG GAG CT C 2639 

Qy 268 0 AAT ACT GCAATT T AT T C AAAT GAT GACT T ACT T T C T T CTAAG GAAGAC AAAAT AAAAGAA 2739 

I I I I I I I I I M I II I I I I I I I I I I I I I I || I I I I I I I I I I | | | | M I I I I 
Db 264 0 AGT ACT GCAGTTTATT CAAAT GAT GACTTATT TATTT CTAAGGAAGCACAGATAAGAGAA 2699 

Qy 274 0 AGT GAAAC ATTT T CAGAT T CAT C T C C GAT T GAGAT AAT AGAT GAAT T T C C C AC GT T T GT C 2799 

I I I I I M I I I I I II I I I I I Mill II I II I M I I II II II II || 

Db 2700 AC T GAAAC GT T T T CAGAT T CAT CT C CAAT T GAAAT TAT AGAT GAGT T C C CT AC AT T GAT C 2759 

Qy 2800 AGTGCTAAAGATGATTC T C CTAAAT TAG C CAAGGAGT ACACT GAT C T AGAAGTAT C C 2856 

Ml I I I I I I I I I I I I II I I I M I I I || | | | | | | | || | | | | | | | M | | | | 
Db 27 60 AGT T C T AAAAC T GATT C AT T T T CTAAAT TAG C C AGGGAAT AT ACT GAC C T AGAAGTAT C C 2819 

Qy 2857 GACAAAAGT GAAAT T GCTAATAT C CAAAG C GGGG CAGAT T CAT TGCCTTGCT T AGAAT T G 2916 

I I N I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2 820 CACAAAAGTGAAATTGCTAATGCCCCGGATGGAGCTGGGTCATTGCCTTGCACAGAATTG 2879 

Qy 2917 C C CT GT GAC CT T T CT T T CAAGAAT AT AT AT C C T AAAGAT GAAG T AC AT GT T T C A 2 97 0 

III M I I M I I I I I I I I I I I I I I I I I I M I I I I I I I | | M I 

Db 28 8 0 C C C CAT GAC CT T T CT T T GAAGAACAT ACAAC C CAAAGT T GAAGAGAAAAT C AGT T T CT CA 2 93 9 

Qy 2971 GAT GAATT CTC CGAAAATAGGT CCAGT GTAT CTAAGGCAT C CATAT CGC CT T CAAAT GT C 3030 

M I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 

Db 2940 GAT GACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTC CAGAT GTT 2999 

Qy 3031 TCTGCTTTGGAACCTCAGACAGAAATGGGCAGCATAGTTAAATCCAAATCACTTACGAAA 3090

I I I I MM I I II I I I I II I II || || I I | | | | | IN | | | | 

Db 3000 TCTGCTTTGGC CAC T C AAGC AGAGAT AGAGAGC AT AGT TAAAC C CAAAGTT CTT GT GAAA 3059 

Qy 3091 GAAGCAGAGAAAAAACTT CCTTCTGACACAGAGAAAGAGGACAGAT CCCTGTCAGCTGTA 3150 

I I I I I I I I I I M M II I I I I I I II Mill I I I I I II I II M I I I I I Ml II 
Db 3060 GAAGCT GAGAAAAAACTTCCTT CCGATACAGAAAAAGAGGACAGATCACCATCTGCTATA 3119 

Qy 3151 T T GT CAG C AGAGC T GAGT AAAACT T C AGTT GT T GAC CT C CT CT AC T G G AGAG AC AT TAAG 3210 

M M I I M I I II I I I I II I I I I I I I II I I I I I I I I I I I I I IN || | | | | || | | 

Db 3120 T T T T CAG C AGAG CT GAGT AAAACT T C AGTT GT T GAC CT C C T GT ACT GGAGAGAC AT TAAG 3179 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270 

M M I I II II I II I I II I I I II I I I I I I I I I I I | I | | | || || | | | M | | | | | | M 
Db 3180 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3239 

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

Mill II I I II I II I I II I I II II I I II I I I I II I I || I I I | | || | | | | | | | I | | 



Db 3240 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3299 

Qy 3331 AT AT ATAAGG GCGT GAT C CAG G CT AT C C AGAAAT CAGAT GAAG G C CAC C CAT T C AG GGC A 3390 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | I I I 
Db 3300 AT AT AC AAG GGT GT GAT C CAAGC T AT C C AGAAAT CAGAT GAAGG C CAC C CAT T C AG GG C A 3359 

Qy 3391 TAT T T AGAAT CT GAAGTT GC T AT AT C AGAG GAAT T G GTT C AGAAAT AC AGTAAT T CT GCT 3450 

Ml I I I M I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I I II II I I I I I I I 
Db 3360 TAT CT GGAAT CT GAAGTT GCT AT AT CT GAGGAGT TGGTT CAGAAGTACAGTAATT CTGCT 3419 

Qy 3451 CT T GGT CAT GT GAAC AGC ACAATAAAAGAAC T GAG GC GGCT T T T CT T AGT T GAT GAT T T A 3510 

I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I 
Db 3420 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3479 

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I II I I 
Db 3480 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3539 

Qy 3571 AAT GGT CT GACAC TACT GAT TT T AGC T CT GAT CT CAC T CTT CAGT AT T C C T GT TAT T TAT 3630 

I I I I I I I I I I I I I M I I II I I I I II I I I II II I I II I I I I I I I II I I I I I I I I I I I 

Db 3540 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3599 

Qy 3631 GAACGGCATCAGGTGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGAT 3690 

M I II I I I I I I I I I I I II I I I I I I I M I I || || I I | | | | | | | | | | | | | | | | | | M 
Db 3600 GAAC GGC AT CAG GCACAGAT AGAT CAT TAT CT AGGAC T T GC AAAT AAGAAT GT T AAAGAT 3659 

Qy 3691 GCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740 

II I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I II I I I I I || 

Db 3660 GCTATGGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3709 
Query Match 62.6%; Score 2343.6; DB 15; Length 4632; 

Best Local Similarity 81.3%; Pred. No. 0; 



Matches 3017; Conservative 0; Mismatches 574; Indels 119; Gaps 21; 

Qy 134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193 

I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I | | M 
Db 23 CTCGGCTCAGTCGGCCCAGCCCCTCTCAGTCCTCCCCT^CCCCCACAACCGCCCGCGGCT 82 

Qy 194 CTGAGGAGAAGCGGC-CCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCC 252 

M I I I I II I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 83 CTGAGACGCGGCCCCGGCGGCGGCGGCAGCAGCTGCAGCATCATC-TCCACCCTCCAGCC 141 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 142 AT G GAAGAC CT GGAC C AGT CTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 198 

Qy 313 CCGCCCGCCTT CAAGT AC CAGT T C GT GAC GGAG C C CGAG GAC GAGGAGGAC GAGGAGGAG 372 

I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I 
Db 199 C AGC C C G C GT TCAAGT AC CAGT T C GT GAG GGAGC C CGAGGAC GAG GAG GAAGAAGAG 255 

Qy 373 GAG GAGGAC GAGGAGGAG GAC GAC GAG GAC CT AGAGGAAC T GGAGGT G CT G GAGAGGAAG 432 

I I I I I I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I | | | | 
Db 256 GAGGAGGAAGAGGAGGAC GAGGAC GAAGACCT GGAGGAGCTGGAGGTGCT GGAGAGGAAG 315 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

I I I I I I I I I I I I I I I I I I I I I lull I I I I I I I I I I I I I I I I I I I 

Db 316 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 375 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546 

I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I III 
Db 376 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 435 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

I I I I I I I I I I I II I I I I I I I II I I I I I I I I I I I I I I I I 

Db 436 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 495 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

I I I I I I I M I M I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I 
Db 496 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 555 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

M I II I I I I I I I II II I I I II II III III II I I I I I I III III 
Db 556 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 615 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I I II I I I I I II I I II I I I I I I I I I I I I I I I I I I I I I 
Db 616 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 675 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 676 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 735 

Qy 8 08 GT GAT AC CCTCCTCT GC AGAAAAAAT TAT GGATT T GAT GGAGC AG C C AG GTAAC AC T GT T 867 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II 
Db 736 GT GAT AC GCTCCTCTG C AGAAAA TATGGACTTGAAGGAGCAGCCAGGTAACACTATT 792 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

I I I I I I M I I I I I I I I I I I II I I I I I I I I I I I II I I I I I I I I I II I I I I I I I II I I I I 
Db 793 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 852 



Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I II 
Db 853 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 912 

Qy 988 GT GT CAT C CT CAGAAGGAACAAT T GAAGAAACT T T AAAT GAAG CT T C T AAAGAGT T GC C A 1047 

M I I M I I I II I I I I I II I I I I I I I I I I I I II I I I I I I I II I I I II 
Db 913 GT AT T AC C CACT GAAG GAAC ACT T CAAGAAAAT GT CAGT GAAG CTT CT AAAGAG GT CT C A 972 

Qy 104 8 GAGAGG G CAACAAAT C C ATT T GTAAAT AGAGAT T TAGC AGAAT T T T C AGAAT T AGAAT AT 1107 

I I I I I I I I I M M I I II I I I I I I I I I I I I I I I I I II I I M I I I I I I I I I 

Db 973 GAGAAGGCAAAAACT CT ACT CAT AGAT AGAGAT T T AAC AGAGT T T T C AGAAT T AGAAT AC 1032 

Qy 1108 TCAGAAAT GGGAT CAT CTTTTAAAGGCT CCC CAAAAGGAGAGT CAGCCAT ATT AGTAGAA 1167 

M I I I I I I I I I I I I I I I II I I III I I I I I I I I I I II III I I I I I I | | | 
Db 1033 T C AGAAAT G GGAT CAT C GTT CAGT GT C T CT C CAAAAGC AGAAT CT GC C GT AAT AGT AGC A 1092 

Qy 1168 AACACTAAGGAAGAAGTAATTGTGAGGAGTAAA GACAAAGAGGATTTAGTTTGTAGT 1224 

II I I I I I I I I M I M I I I I I I I I I I II MM I I I M I I 

Db 1093 AAT C CT AGG GAAGAAAT AAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGT AAT 1152 

Qy 1225 GC AGC C CTT C ACAGT C C ACAAGAAT C AC C T GTGGGTAAAGAAGAC 1269 

I I I I M I I I I II II I I I II I I II II I II I II 

Db 1153 AACAT C C T T CAT AAT CAACAAG AGT TACCT ACAGCT CT TACT AAAT T GGT TAAAGAGGAT 1212 

Qy 1270 AGAGT T GT GT C T C C AGAAAAGACAAT GGAC ATT T T T AAT GAAAT G C AGAT GT CAGT AGT A 1329 

I M I I I I M I II I II II III II II M II II M II I I I I II I I I 

Db 1213 GAAGT T GT GT CTT CAGAAAAAGCAAAAGAC AGT T T TAAT GAAAAGAGAGT T GC AGT GGAA 1272 

Qy 1330 GC AC CT GT GAG GGAAGAGT AT GCAGACT T T AAG C CAT T T GAACAAGCAT G G GAAGT GAAA 1389 

M M I I I II II I II II II I I II I II II M I I II II I II II M II II I M II 
Db 1273 GCT C C TAT GAGGGAGGAAT AT GCAGAC T T CAAAC CAT T T GAG C GAGT AT GGGAAGT GAAA 1332 

Qy 1390 GAT ACT TAT GAGGGAAGTAGGGAT GT GCT GGCT GCTAGAGCT AATGTG 1437 

II II I II I I I II I II II II I I II II II I I || || 

Db 1333 GATA — - GTAAGGAAGAT AGT GAT AT GTT GGCT GCT GGAGGTAAAAT CGAGAGCAACTT G 1389 

Qy 1438 GAAAGTAAAGT GGACAGAAAAT GC T T GGAAGAT AG C CT GGAGCAAAAAAGT CT T GGGAAG 1497 

M II I I II I II I II I II I II I I I I II I II II II || || I I I III I II 
Db 1390 GAAAGTAAAGT GGAT AAAAAAT GT T T T GCAGAT AGC C T T GAGCAAAC TAAT C AC GAAAAA 144 9 

Qy 1498 GAT AGT GAAGGCAGAAAT GAGGAT GCTT CTTT C C C CAGTACCC CAGAACCT GT GAAGGAC 1557 

I I M II II I II II I II Ml I I I II II I I I I I II I I II I II I I I II I I I 
Db 1450 GAT AGT GAGAGT AGTAAT GAT GAT ACT T CTT T C C C CAGT AC GC C AGAAGGT AT AAAGGAT 1509 

Qy 1558 AGCT C C AGAGC AT AT AT T AC CTGTGCTTCCTT T A C CT C AG CAAC C GAAAG CAC CAC A 1614 

Ml I I I II II II I II II II II II II I I I I I I I I II I I I I I I II 
Db 1510 C GT T CAGGAGCAT AT AT CAC AT GT GC T C C CT T TAAC C C AG CAGCAACT GAGAG CAT T GC A 1569 

Qy 1615 GCAAACACT TT C CCT TT GTT AGAAGAT CATACTT CAGAAAATAAAACAGAT GAAAAAAAA 1674 

II I I I I III II I I I I I I I I I I I II II I I I I I I I I II M I II II I II M M I II 
Db 157 0 ACAAAC AT TTTTCCTTTGT T AGGAGAT C CT ACT T CAGAAAAT AAGAC C GAT GAAAAAAAA 1629 

Qy 1675 AT AGAAGAAAG GAAG G C C C AAAT T AT AAC AG AGAAG ACT AG CCC C AAAAC GT C AAAT 1731 

I II II I II II I I I II II I I II I II I I I II II II I II II I I I I II I I I II I I 
Db 1630 AT AGAAGAAAAGAAG G C C CAAAT AGT AAC AGAGAAGAAT ACT AGCAC CAAAAC AT CAAAC 1689 



Qy 1732 C CT T T C CTT GT AGCAGT AC AGGAT T CT GAG G C AGAT TAT GT T ACAAC AGAT AC CT T AT C A 1791 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I M I II 
Db 1690 C CTTTT CTT GTAGCAGCACAGGAT T CT GAGAC AGAT TAT GT CACAACAGATAATTTAACA 174 9 

Qy 1792 AAGGT GACT GAGGCAG CAGT GT C AAAC AT G C C T GAAG GT C T GAC G C C AGAT T T AGT T C AG 1851 

I I M I I I I I I II I || M I I I I M I I I I | | | || | | | | | | | | | | | | I I I I I I III 
Db 1750 AAGGT GACT GAGGAAGT C GT GG CAAAC AT GC C T GAAG GC C T GACT C C AGAT T T AGT ACAG 1809 

Qy 1852 GAAGCAT GT GAAAGT GAACT GAAT GAAGCCACAGGTACAAAGATT GCTTAT GAAACAAAA 1911 

I I M I I I I M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I 

Db 1810 GAAGCAT GT GAAAGT GAATT GAAT GAAGTTACT GGTACAAAGATT GCTTAT GAAACAAAA 18 69 

Qy 1912 GT G GACT T GGT C CAAACATC AGAAG C T AT ACAAGAAT CAC T T T AC C C C AC AGC ACAGCT T 1971 

I I I M I I I I I I I I I I I I I II I II III I I I I I I I I I I II || I I I I I I I I I I I 
Db 1870 AT GGACT T GGT T CAAAC AT C AGAAGT T AT GCAAGAGT CAC T C TAT C C T GC AG CAC AGC T T 192 9 

Qy 1972 T GC C CAT C ATT T GAGGAAGC T GAAGCAACT C C GT CAC C AGTT T T GCC T GAT AT T GT T AT G 2031 

I I I I I I I I II I I I I II I I I I I I III 1 ! I I | | | | | || | | | | | | | | | | | | | | | | | 
Db 1930 T GC C CAT CAT T T GAAGAGTC AGAAGCT AC T C CT T CAC CAGT T T T GCCT GAC AT T GT T AT G 1989 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

I I M I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I 

Db 1990 GAAGCACCATT GAATT CTGCAGTTCCTAGTGCT GGT GCTTCCGTGATACAGCCCAGCTCA 204 9 

Qy 2092 T C CC C AC T GGAAGC AC C T C C T C CAGT T AGT TAT GACAGT ATAAAGC T T GAGC CT GAAAAC 2151 

M III I I I I I II II I I I I I I I I I I I I II II I I I I I I I I I I I I I I I I I 
Db 2050 T CAC CAT T AGAAG CTT CTT CAGTTAATTATGAAAGCATAAAACATGAGCCT GAAAAC 2106 

Qy 2152 CCCCCAC CAT AT GAAGAAGC CAT GAAT GTAGCACT AAAAGCTTTGGGAACAAAGGAA 2208 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | M I I I I 
Db 2107 C C C C CAC CAT AT GAAGAGG C CAT G AGT GT AT C ACTAAAAAAAGT AT CAG GAATAAAGGAA 2166 

Qy 2209 GGAATAAAAGAGC CT GAAAGTT T T AAT G C AGC T GT T CAGGAAAC AGAAG C T C CT T AT AT A 2268 

I Ml I I I I M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 2167 GAAATTAAAGAGCCT GAAAATAT TAAT GCAGCT CT T CAAGAAACAGAAGCT CCTT ATATA 2226 

Qy 2269 T C CAT T GC GT GT GAT T TAAT T AAAGAAACAAAGC T C T C C ACT GAGC CAAGT C C AGATT T C 232 8 

M I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I III I M I I I I I I 
Db 2227 T C T ATT G CAT GT GAT T TAAT TAAAGAAACAAAG CT T T CT GCT GAAC C AGCT C C GGATT T C 22 86 

Qy 232 9 T CT AAT TAT T C AGAAAT AG CAAAAT T C GAGAAGT C GGT GC C C GAAC ACGC T GAG CT AGT G 2388 

M I I I I I I I I II I I I I I I I I I I I II || I I I I I I || | | | | | | | | | | M 
Db 2287 T C T GAT TAT T C AGAAAT GGCAAAAGT T GAAC AG C CAGT GCCT GAT CAT T CT GAGC T AGT T 2346 

Qy 2389 GAGGAT T C C T CAC CT GAAT CT GAAC CAGTT GACT TAT T T AGT GAT GAT T C GAT T C C T GAA 244 8 

II I I I I I I I I I I I I I I I I I I I I II I I I M I I I I I I I I I I I II | | || | | || | | | | | 

Db 2347 GAAGAT T C CT C AC CT GAT T CT GAAC CAGT T GAC T TAT T T AGT GAT GAT T CAAT AC CT GAC 24 06 

Qy 2449 GT C C CACAAACAC AAGAGGAGGCT GT GAT GCT CAT GAAG GAGAGT C T C ACT GA A 2 502 

M I II I I I I I I I I I I I I I I I II I I I I I I I I I II I I I I II I I I I I I 
Db 24 07 GTT CCACAAAAACAAGAT GAAACT GT GAT GCTT GT GAAAGAAAGT CT CACT GAGACTT CA 24 66 

Qy 2503 GT GT CT GAGAC AGT AG C C CAGC ACAAAGAG GAGAGAC T T AGT GCCT CAC CT CAG G AGC T A 2562 

I I Ml I III I I I I I I I I I I I I I I III III I 

Db 2467 T T T GAGT CAAT GAT AGAAT AT GAAAAT AAG GAAAAACT CAGT G C T T T G C CAC C T GAG GGA 252 6 

Qy 2563 GGAAAG C CAT AT T T AGAGT CTTTT C AGC C CAAT T T ACAT AGT ACAAAAGAT G C TGCA 2619 



Db 2 527 G GAAAGC CAT AT T T GGAAT C TT T T AAG C T C AGTT T AGAT AACACAAAAG AT AC C CT GTT A 258 6 

Qy 262 0 T CT AAT GAC AT T C C AAC AT T GAC C AAAAAG GAG AAAAT T T C T T T G C AAAT G GAAGAGT T T 2679 

II M I I II MINIMI M II I II I II I I I I II I II I! M I II I I III I 

Db 2587 C C T GAT GAAGT T T CAAC AT T GAGCAAAAAGG AGAAAAT T C CTT T GC AGAT GGAG GAG C T C 2646 

Qy 2 68 0 AAT ACT G C AAT T TAT T CAAAT GAT GACT TAC T T T C T T C TAAGGAAG AC AAAAT AAAAGAA 2739 

I M II II I II I I II I II I II II II I I II II II I I II I I II I I MM MM 
Db 2647 AGTACTGCAGTTTATT CAAAT GAT GACT TATTTATTT CTAAGGAAGCAC AGAT AAGAGAA 2706 

Qy 274 0 AGT GAAAC AT T T T C AGAT T CAT CT C C GAT T GAG AT AAT AGAT GAAT T T C C C AC GT T T GT C 27 99 

I I I I I I I I M I II I I II I I I II II II II I II M I II I I I II M II M II 
Db 2707 AC T GAAACGT T TT C AGAT T CAT C T C CAAT T GAAATTAT AGAT GAGT T C C CT ACAT T GAT C 2766 

Qy 28 00 AGT GCTAAAGAT GATT C T C CT AAAT T AGC CAAGGAGT AC ACT GAT C T AGAAGT AT C C 2856 

Ml Mill II I I II I II II II II I I II III II II || | | | | | | || | || | | 
Db 2 7 67 AGT T C TAAAACT GAT T CAT T T T CTAAAT T AGC C AGGGAAT AT ACT GAC C T AGAAGT AT C C 282 6 

Qy 2857 GACAAAAGT GAAAT T GC T AATAT C CAAAGC G G G G CAGAT T CAT T GC C T T G CTT AGAAT T G 2916 

I I M M I I I II I II I I II I I M I I I I I III I I I I I I I I I I I 

Db 2 827 C ACAAAAGT GAAAT T GCTAAT GC C C C GG AT GGAGCT G GGT CAT T G C C T T G C ACAGAATT G 2886 

Qy 2917 CCCT GT GACCTTT CTT T CAAGAATATATAT CCTAAAGAT GAAG TACATGTTTCA 2970 

Ml I II II I I I I I I I I I I || I I I I || | M I II II I I I I M I 

Db 2887 CC C CAT GAC CT T T C T T T GAAGAACATACAAC C CAAAGT T GAAGAGAAAAT C AGT T T CT C A 294 6 

Qy 2 971 GAT GAAT T CT C C GAAAAT AG GT C C AGT GT AT CTAAGGCAT C CAT AT C GC C T T CAAAT GT C 3030 

I M I I I I I I I I II I I II I I II I II I I I I II | | | | || || | | 

Db 2947 GAT GACT T T T C TAAAAAT GG GT CT G CT AC AT CAAAGGT GC T CT T ATT GC C T C CAGAT GT T 3006 

Qy 3031 TCTGCTTT GGAAC CT CAGACAGAAAT GGG C AGCAT AGT TAAAT C CAAAT C ACT TAC GAAA 3090 

I M II I II I I MM I I II II I I I I II I I I I || | | || || III MM 
Db 3007 TCTGCTTTGGC C AC T CAAGC AGAGAT AGAGAGCAT AGT T AAAC C CAAAGT T C T T GT GAAA 3066 

Qy 3091 GAAGC AGAGAAAAAACT T CCT T CT GAC ACAGAGAAAGAGGAC AGAT CCCT GT CAGC T GT A 3150 

I M II II II I I I I I I II I II II II Mill II I II II II I I II I I II III II 
Db 3067 GAAGCT GAGAAAAAAC TTCCTTCC GAT AC AGAAAAAGAG GACAGAT C AC CAT C T GCT AT A 3126 

Qy 3151 T T GT CAG CAG AGC T GAGT AAAACT T CAGT T GT T G AC CT C CT C TACT G GAGAGAC AT T AAG 3210 

M I I I I I M M I I M I II I II II II II II I I I II I I I II I I I I I I II I II I II I I I II 
Db 3127 T T TT CAGCAGAGC T GAGTAAAAC T T CAGT T GT T GAC CT CCT GT AC T GGAGAGAC AT T AAG 3186 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270 

M I M I I I M I M M M II I II II II I II I I I II I II II I II II II II I II I II I 
Db 3187 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3246 

Qy 32 71 ATT GT CAGT GTAACGGCCTACATTGC CTT GGC CCT GCTCTCGGTGACTATCAGCTTTAGG 3330 

Mill II I I I II II II I II II II II II I I II II II II I I I I I I II II II II I M I 
Db 3247 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3306 

Qy 3331 AT AT AT AAG G GCGT GAT C C AGGCT AT C C AGAAAT CAGAT GAAG G C C AC C CAT T CAGGG C A 3390 

IMM Mill I M II I II II II II II I I I I II II II I II MINIM 

Db 3307 AT AT AC AAG GGT GT GAT C CAAG C TAT C C AGAAAT CAGAT GAAG GC CACC C AT T CAG GG C A 3366 

Qy 3391 TAT T T AGAAT CT GAAGT T G C TAT AT CAGAGGAAT T GGT T C AGAAAT AC AGT AAT T CT G C T 3450 

Ml I M I II || || I I I M II I I II INN I II I II II II I I II I I I II II I I I II 



Db 3367 TATCTGGAATCTGAAGTTGCTATATCTGAGGAGTTGGTTCAGAAGTACAGTAATTCTGCT 3426 



Qy 




Db 



Qy 




Db 



Db 



Qy 




Qy 3631 GAACGGCATCAGGTGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGAT 3690 




Db 3607 GAACGGCATCAGGCACAGATAGATCATTATCTAGGACTTGCAAATAAGAATGTTAAAGAT 3666 



Qy 




Db 



RESULT 7 
US-09-789-386-1 

; Sequence 1, Application US/09789386 

; Patent No. US20020010324A1 

; GENERAL INFORMATION: 

; APPLICANT: MICHALOVICH, DAVID 

; APPLICANT: PRINJHA, RABINDER KUMAR 

; TITLE OF INVENTION: NOVEL COMPOUNDS 
; FILE REFERENCE: GP-30165-C1 

; CURRENT APPLICATION NUMBER: US/09/789,386 
; CURRENT FILING DATE: 2001-02-21 

; PRIOR APPLICATION NUMBER: U.K. 9916898.1 
; PRIOR FILING DATE: 1999-07-19 
; PRIOR APPLICATION NUMBER: U.K. 9816024.5 
; PRIOR FILING DATE: 1998-07-22 
; PRIOR APPLICATION NUMBER: US 09/359,208 
; PRIOR FILING DATE: 1999-07-22 
; NUMBER OF SEQ ID NOS: 6 

; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 1 
; LENGTH: 3579 
; TYPE: DNA 
; ORGANISM: HOMO SAPIENS 
US-09-789-386-1 

Query Match 61.2%; Score 2289.2; DB 9; Length 3579; 

Best Local Similarity 81.5%; Pred. No. 0; 

Matches 2925; Conservative 0; Mismatches 548; Indels 117; Gaps 19; 
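
The percentages in these summary lines can be checked by hand. The sketch below assumes, based only on the figures printed in this listing and not on GenCore documentation, that Query Match is the score divided by the perfect score (3741 for this query) and that Best Local Similarity is the match count divided by the total of matches, mismatches and indels. The function name is hypothetical.

```python
def summary_percentages(score, perfect_score, matches, mismatches, indels):
    """Recompute the two percentages shown in a result summary.

    Assumed relations (inferred from this listing, not from vendor docs):
      Query Match           = score / perfect score
      Best Local Similarity = matches / (matches + mismatches + indels)
    """
    query_match = 100.0 * score / perfect_score
    aligned_columns = matches + mismatches + indels
    best_local_similarity = 100.0 * matches / aligned_columns
    return round(query_match, 1), round(best_local_similarity, 1)


# Figures from the summary above (US-09-789-386-1).
print(summary_percentages(2289.2, 3741, 2925, 548, 117))
# Figures from the preceding result (Score 2343.6; Matches 3017).
print(summary_percentages(2343.6, 3741, 3017, 574, 119))
```

Under these assumptions both results reproduce the printed values (61.2% and 81.5%; 62.6% and 81.3%), which also supports reading the OCR string "Inciels" in the earlier summary as "Indels".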

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II Ill 

Db 1 AT GGAAGAC C T GGAC CAGT CTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 57 



Qy 313 CCGCCCGCCTT CAAGT AC C AGT T C GT GAC GG AGC C C GAGGAC GAGGAGGAC GAG GAGG AG 372 



Db 58 C AG C C C G C GT T C AAGT AC C AGT T C GT GAGGGAGC C C GAGGAC GAG GAG GAAGAAGAG 114 

Qy 373 GAGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTGCTGGAGAGGAAG 432 

M I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | M I I 
Db 115 GAGGAG GAAGAG GAGGAC GAGGAC GAAGAC CT GGAGGAGC T GGAG GT GCT GGAGAG GAAG 17 4 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

Ml: I I I I I I I I I I | | I I II I I I I I I II I | | | I || I | | | M III 

Db 175 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 234 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546 

I M M I I I I I MM M I II I I I I I I II M I I M II I II I I II I I II I 

Db 235 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 294 
Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

M I I M II I I I I I I M I I I I I II M I MIMI I I I I II 

Db 295 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 354 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

M I M I I I II II I I I I I I I I II II I I II I I I II II I I M M I M II II II I II 

Db 355 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 414 
Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

M M M I I MM II II I II II II III III I II I II II III III 

Db 415 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 474 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I II I I I M I I I M I I I I I I II I I I I I I M II II II I 
Db 475 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 534 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

M II I II I I II II M I I I II II I II I I I I I || I I M M II I I I II II I II II I I I II 
Db 535 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 594 

Qy 808 GT GAT AC CCTCCTCT GCAGAAAAAAT TAT GGAT T T GAT GGAGCAGC C AGGTAACACT GT T 8 67 

M M I II II I I II II II I I I I I MINI I II I I I I I I II II II I II 

Db 595 GT GAT AC GCTCCTCT GCAGAAAA TAT GGACT T GAAGGAGCAGC C AGGT AAC AC TAT T 651 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

I M M I I M II II II II I I I I I II I II I II I II I I I I I I I || I M I II I I I II I I I II 
Db 652 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 711 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

M II II M I II II I I I II II I I I I I II II II I I II I I II I I I I I || Ml || 
Db 712 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 771 

Qy 988 GT GT CAT C C T CAGAAGGAACAATT GAAGAAACT T T AAAT GAAGC T T C T AAAGAGT T GC C A 1047 

M I I M I I I I M I I II II MIMI I I I I || M I II I I I I II II I II 
Db 772 GT AT T AC C C ACT GAAGGAAC AC TT C AAGAAAAT GT CAGT GAAGCT T CT AAAGAG GT CT C A 831 

Qy 1048 GAGAGGGCAACAAAT CCATT T GTAAAT AGAGATTTAGCAGAATT TT CAGAATTAGAATAT 1107 

MM Mill II M I I II II I I II I II I I II I I II II II II II II I II I I 
Db 832 GAG AAGGC AAAAAC T CT AC T C AT AGAT AG AGAT T T AAC AGAGT T T T C AG AAT T AGAAT AC 891 

Qy 1108 T CAGAAAT GGGAT C AT CT T T T AAAGG CT C C C CAAAAGGAGAGT C AGC C AT AT T AGT AGAA 1167 

M M II I II || I || M I II I I III II II II I III II III II II I II I I 



Db 8 92 T C AGAAAT GGGAT CAT C GT T C AGT GT C T C T C CAAAAG C AGAAT C T G C C GT AATAGT AGC A 951 

Qy 1168 AACACTAAGGAAGAAGTAAT T GT GAGGAGTAAA GAC AAAGAG GAT T T AGT T T GT AGT 1224 

M Ml I M I i i M I I I I I II I I I I I I 

Db 952 AAT C C T AGG G AAGAAAT AAT C GT GAAAAAT AAAG AT G AAGAAGAG AAGT T AGT T AGT AAT 1011 

Qy 1225 G C AGC C C T T CACAGT C CACAAGAAT C AC CT GT GGGTAAAGAAGAC 1269 

I I I I I I I I I I I I I I I I I I I I I II I I I II I I I 

Db 1012 AACAT C CT T CAT AAT C AACAAGAGT TAC C T AC AGCT CT T AC T AAAT T GGT T AAAGAGGAT 1071 

Qy 1270 AG AGT T GT GT C T C CAGAAAAGACAAT G GAC AT T T T T AAT GAAAT GC AGAT GT C AGTAGT A 132 9 

I M I I I I I M I I II I I I III I I I I I II I I I I I I I I I I I I II I I 

Db 1072 GAAGT T GT GT CT T C AGAAAAAGC AAAAGAC AGT T TT AAT GAAAAGAGAGT T G C AGT G GAA 1131 

Qy 1330 G C AC C T GT GAGG GAAGAGT AT GC AGAC T T TAAGC CAT T T GAAC AAGC AT GG GAAGT GAAA 1389 

II Ml I I I I I I I II I I I I I I I I I I I M I II I I I I I I II MINIMUM! 
Db 1132 GC T C CT AT GAGG GAGGAAT AT GC AGACT T CAAAC CAT T T GAG C GAGT AT G G GAAGT GAAA 1191 

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437 

I I I I I I I I I I I I I I I II I M I I I I I I I I I || || 

Db 1192 GATA GTAAGGAAGAT AGT GAT AT GTT GGCT GCT GGAGGTAAAAT CGAGAGCAACTT G 1248 

Qy 1438 GAAAGT AAAGT G GAC AGAAAAT G CT T G GAAGAT AG C C T GGAGCAAAAAAGT CTT GG GAAG 1497 

I M I I M I I M I I I I I I I I I I I I I I I I I I II I I I I I I I II IN I If 
Db 1249 GAAAGT AAAGT GGATAAAAAAT GT T T T G C AGAT AGC C TT GAG CAAAC T AAT C AC GAAAAA 1308 

Qy 14 98 GAT AGT GAAGGCAGAAAT GAG GAT GCTTCTTTCCC CAGT AC C CC AGAAC CT GT GAAGGAC 1557 

I M I I I II I I I I I I I I I I I I I I I I I I I I I | M | | | | | | | | | | | | M | | 
Db 1309 GAT AGT GAGAGT AGT AAT GAT GAT AC TTCTTTCCC C AGT AC GC C AGAAG GT AT AAAGGAT 1368 

Qy 1558 AGC T C CAGAG CAT AT AT TAC C T GT GCT T C CT T TAC C T CAGCAACCGAAAGCACCACA 1614 

I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I II I I || || 

Db 1369 C GT C C AG GAG CAT AT AT C AC AT GTGCTCCCTT TAAC C C AGC AGCAACT GAGAGC AT T G C A 1428 

Qy 1615 G CAAACAC TTTCCCTTT GT T AGAAGAT CAT ACTT CAGAAAATAAAAC AGAT GAAAAAAAA 1674 

I I I I I I III I I I I I I I I I I I I I I I I I I I I || | | | | | M I II I I I I I I I I I I I I 
Db 1429 ACAAAC AT TTTTCCTTTGT T AGGAGAT CCT ACTT CAGAAAATAAGAC C GAT GAAAAAAAA 148 8 

Qy 1675 AT AGAAGAAAGGAAG G C C CAAAT T AT AAC AGAGAAG ACT AG C C C CAAAAC GT CAAAT 1731 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I M I I I | I I I | 

Db 14 89 ATAGAAGAAAAGAAGGCCCAAATAGTAACAGAGAAGAATACTAGCACCAAAACATCAAAC 154 8 

Qy 1732 CCTTTCCTT GT AG C AGT AC AGGAT T C T GAG G CAGAT TAT GT T ACAAC AGAT AC CT T AT C A 1791 

I I I I I I I I I I I M I I I I I I II I I I I I M I I I I I I I I I I I I I I I I I I I I III II 
Db 1549 CCTTT T CTT GT AGCAGCACAGGATT CT GAGACAGATT AT GT CACAACAGATAATTTAACA 1608 

Qy 17 92 AAG GT GACT GAG GC AGCAGT GT CAAAC AT G C CT GAAGGT CT GAC GC CAGAT T T AGT T C AG 1851 

M II I I I I I I I I I II III I I I I M | | | || I I I II I I I I I I I I II I I || | I III 
Db 1609 AAGGT GACT GAGGAAGT C GT GG CAAAC AT G C C T GAAGGC CT GAC T C CAGAT T T AGT AC AG 1668 

Qy 18 52 GAAG CAT GT GAAAGT GAACT GAAT GAAG C C ACAGGT ACAAAGAT T GCT TAT GAAAC AAAA 1911 

I I I I I I I I I I I I I I I I I I I I I I M I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1669 GAAGC AT GT GAAAGT GAAT T GAAT GAAGT TAC T G GTACAAAGAT T G C T TAT GAAACAAAA 1728 

Qy 1912 GT GG AC T T GGT C CAAAC AT C AGAAGCT AT ACAAGAAT CACT T TAC C C C AC AGC AC AGC T T 1971 

I I I I I I I I I I I I I I I I I I I I I I I III Mill Mill II M I I I IE I I I M I 
Db 172 9 AT GGACT T G GT T CAAACAT C AGAAGT TAT GCAAGAGT CACT CT AT CCT GC AG C AC AG CTT 1788 



Qy 1972 TGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATATTGTTATG 2031 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 I I I I I | | | || | | | | | | | | | | | || MINIMI 

Db 1789 TGCCCATCATTTGAAGAGTCAGAAGCTACTCCTTCACCAGTTTTGCCTGACATTGTTATG 1848 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

M 1 1 1 1 1 II 1 1 MM 1 1 1 1 1 1 1 1 1 1 1 1 | MINIM! 1 

Db 1849 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 1908 

Qy 2092 TCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCTGAAAAC 2151 

M 1 M 1 1 1 1 1 M M 1 1 1 1 1 1 MINI II Mill 1 1 1 II 1 1 II II 1 1 1 

Db 1909 T CAC CAT T AGAAG C TT C T T CAGT T AAT TAT GAAAG C AT AAAACAT GAG CC T GAAAAC 1965 

Qy 2152 CC C C CACCATAT GAAGAAGC CAT GAAT GT AGCACT AAAAG C T T T G GGAACAAAG GAA 2208 

1 1 1 1 1 M 1 M II 1 II 1 1 1 1 1 1 II 1 1 II 1 1 1 1 1 1 M M 1 MM II II 1 1 1 

Db 1966 CCCCCACCATATGAAGAGGCCATGAGTGTATCACTAAAAAAAGTATCAGGAATAAAGGAA 2025 

Qy 2209 GGAATAAAAGAGCCTGAAAGTTTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTTATATA 2268 

1 M 1 1 Mill 1 1 1 II 1 II 1 II 1 1 II 1 1 1 1 M II II 1 1 1 1 1 1 1 II 1 M 

Db 2026 GAAATTAAAGAGCCTGAAAATATTAATGCAGCTCTTCAAGAAACAGAAGCTCCTTATATA 2085 

Qy 2269 TCCATTGCGTGTGATTTAATTAAAGAAACAAAGCTCTCCACTGAGCCAAGTCCAGATTTC 2328 

M 1 1 1 1 1 1 1 1 1 1 1 M II 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 II MM III III 1 II 1 M 

Db 2086 TCTATTGCATGTGATTTAATTAAAGAAACAAAGCTTTCTGCTGAACCAGCTCCGGATTTC 2145 

Qy 2329 TCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGCTAGTG 2388 

1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 M 1 1 1 II 1 1 1 1 II 1 1 1 1 1 1 

Db 2146 TCTGATTATTCAGAAATGGCAAAAGTTGAACAGCCAGTGCCTGATCATTCTGAGCTAGTT 2205 

Qy 2389 GAGGATTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTCCTGAA 2448 

M 1 1 1 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 II 1 1 1 II 1 II 1 1 1 1 1 II 1 II 1 II 1 II II 1 1 II 1 

Db 2206 GAAGATTCCTCACCTGATTCTGAACCAGTTGACTTATTTAGTGATGATTCAATACCTGAC 2265 

Qy 2449 GT C C CACAAAC ACAAGAG GAG G CT GT GAT GCT CAT GAAGGAGAGT CT CACT GA A 2502 

1 1 M 1 1 1 M II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 II 1 1 1 1 1 1 1 1 | M 1 

Db 2266 GTTCCACAAAAACAAGATGAAACTGTGATGCTTGTGAAAGAAAGTCTCACTGAGACTTCA 2325 

Qy 2503 GTGTCTGAGACAGTAGCCCAGCACAAAGAGGAGAGACTTAGTGCCTCACCTCAGGAGCTA 2562 

1 1 Ml 1 III 1 1 1 1 1 M 1 1 1 1 II 1 III III 1 

Db 2326 TTTGAGTCAATGATAGAATATGAAAATAAGGAAAAACTCAGTGCTTTGCCACCTGAGGGA 2385 

Qy 2563 GGAAAG C CAT AT T T AGAGT CT T T T CAGC C CAAT T TAC AT AGTACAAAAGAT G C TGCA 2619 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 II 1 1 1 1 1 II 1 1 1 II II 1 1 1 II 1 1 1 1 

Db 2386 GGAAAGCCATATTTGGAATCTTTTAAGCTCAGTTTAGATAACACAAAAGATACCCTGTTA 2445 

Qy 2620 TCTAATGACATTCCAACATTGACCAAAAAGGAGAAAATTTCTTTGCAAATGGAAGAGTTT 2679 

II 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 II 1 1 1 1 II 1 1 1 II 1 1 III 1 

Db 2446 CCTGATGAAGTTTCAACATTGAGCAAAAAGGAGAAAATTCCTTTGCAGATGGAGGAGCTC 2505 

Qy 2680 AATACTGCAATTTATTCAAATGATGACTTACTTTCTTCTAAGGAAGACAAAATAAAAGAA 2739 

1 1 M 1 II 1 1 1 1 II 1 1 II 1 1 1 1 1 1 II II 1 1 1 II II 1 1 M 1 1 1 1 1 1 1 1 1 II 1 

Db 2506 AGTACTGCAGTTTATTCAAATGATGACTTATTTATTTCTAAGGAAGCACAGATAAGAGAA 2565 

Qy 2740 AGTGAAACATTTTCAGATTCATCTCCGATTGAGATAATAGATGAATTTCCCACGTTTGTC 2799 

1 M 1 1 1 1 1 1 1 1 1 1 II 1 II II II II 1 1 1 1 1 II 1 1 II 1 1 II II II M II II 

Db 2566 ACTGAAACGTTTTCAGATTCATCTCCAATTGAAATTATAGATGAGTTCCCTACATTGATC 2625 



Qy 2800 AGT GCTAAAGAT GATT C T C CT AAAT T AGC C AAG G AGT AC AC T GAT CT AGAAGT AT C C 2 856 

I M M I I I I I I I I I I I I II I I II I I I I III II I I I I I II I I I I I I I I I I 
Db 2626 AGT T CT AAAAC T GATT CAT T T T CT AAAT T AGC C AGG GAAT AT ACT GAC CT AGAAGT AT C C 2685 

Qy 2857 GAC AAAAGT GAAAT T GCT AAT AT CCAAAGC GG G G C AGAT T CAT T G C C T T G CT TAGAAT T G 2916 

I I I I I I I I I I I I I I I I I Mill I I I I I I I II I I I I I II I I I 

Db 268 6 CAC AAAAGT GAAAT T GCTAAT G C CC C G GAT GGAG CT G G GT CAT T G C C T T G C ACAGAAT T G 274 5 

Qy 2 917 CCCTGTGACCTTTCTTTCAAGAATATATATCCTAAAGATGAAG T AC AT GT T T C A 297 0 

IN I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I 

Db 27 46 C C C CAT GAC CT T T C T T T GAAGAAC AT ACAAC C CAAAGT T GAAGAGAAAAT CAGT T T CT C A 2 8 05 

Qy 2971 GAT GAATT CT C C GAAAAT AG GT C CAGT GT AT CT AAGGCAT C CAT AT C GC C TT CAAAT GT C 3030 

M I I I I I I I I I I I I I I I I I I | I | | M I I I I I I I I I I I I I I 

Db 2806 GAT GACT TT T CTAAAAAT G G GT CT G CT AC AT CAAAGGT GCT CT T AT T GC C T C C AGAT GT T 2 8 65 

Qy 3031 T CT G C T T T GGAAC CT C AGACAGAAAT GG G C AGC AT AGT T AAAT C CAAAT C AC TT AC GAAA 3090 

I I I I M I I II I I I I I I I I I I I I I I I I I I I I I I I I || I 1 Ml | | | I 
Db 2 8 66 TCTGCTTTGGC CAC T C AAGCAGAGAT AGAGAG C AT AGT TAAAC CCAAAGT T CTT GT GAAA 2925 

Qy 3091 GAAGCAGAGAAAAAACTTCCTTCTGACACAGAGAAAGAGGACAGATCCCTGTCAGCTGTA 3150 

I I I M I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II III M 
Db 2926 GAAG CT GAGAAAAAACT T C CT T C C GAT ACAGAAAAAGAG GAC AGAT CAC CAT CT GCT AT A 2985 

Qy 3151 TT GT C AGC AGAGC T GAGTAAAAC T T CAGT T GT T GAC CT C CT C T AC T GGAGAGAC AT T AAG 3210 

M I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I 
Db 2 98 6 TT TT C AGC AGAG CT GAGTAAAACT T CAGT T GT T GAC CT C CT GT AC T GGAGAGAC AT T AAG 3045 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 32 70 

I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I M I I 
Db 3046 AAGACT GGAGT GGT GT T T GGT G C C AG C CT AT T C CT G CT G C T T T C ATT GAC AGT ATT C AGC 3105 

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

I I I I I M I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I II I I I I I II I I II I 
Db 3106 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3165 

Qy 3331 ATATATAAGGGCGT GATCCAGGCTAT C CAGAAAT CAGATGAAGGC CAC CCATTCAGGGCA 3390 

I I I I I I I I I I I I I I M I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 

Db 3166 AT AT ACAAGG GT GT GAT C CAAGC T AT C CAGAAAT CAGAT GAAGGC CAC C CAT T C AGG GC A 3225 

Qy 3391 TAT T TAGAAT C T GAAGT T G CT AT AT CAGAG GAAT T GGT T CAGAAAT ACAGTAAT T C T G CT 3450 

M I I I I I I I I M I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I 
Db 3226 TAT CT G GAAT CT GAAGT T GCT AT AT CT GAGGAGT T G GT TC AGAAGT AC AGTAAT T CT GC T 32 85 

Qy 34 51 C T T G GT CAT GT GAAC AG C ACAATAAAAGAACT G AGG C GGC T TT T CT T AGT T GAT GATT T A 3510 

I I I I I I I I I I I I I M I I I I I I I M I I I I I Mill I I II I I I II II I I I I I I I I I 
Db 3286 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3345 

Qy 3511 GTT GATT CCCTGAAGTTTGCAGTGTT GAT GTGGGTGTTTACTTATGTT GGT GCCTTGTTC 3570 

I I I I I M I II II I I II II I I II I I I I I II I I I I I Mill II II I I I II I I II I I I I 
Db 3346 GTT GATT CTCTGAAGTTTGCAGT GTT GAT GTGGGTATTTACCTAT GTT GGT GCCTTGTTT 34 05 

Qy 3571 AAT G GT CT GAC ACT ACT GAT T T TAG C T C T GAT C T CACT CTT CAGT AT T C CT GT T AT TT AT 3630 

I I I I I I I I I M I M I I M I II I I II II I II II II II II I I I I II I I I I I I I I I I II 
Db 3406 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3465 

Qy 3631 GAAC GG CAT CAG GT GC AG AT AGAT CAT T AT CT AG GAC T T GCAAACAAG AGT GT TAAGGAT 3690 





Db 3466 GAAC GG C AT CAG GC G C AGAT AGAT CAT TAT C T AGGAC T T GCAAATAAGAAT GT TAAAGAT 352 



Qy 




Db 
; PRIOR APPLICATION NUMBER: US 09/314,161 
; TYPE: DNA 
Query Match 61.2%; Score 2289.2; DB 9; Length 3579; 
Matches 2925; Conservative 0; Mismatches 548; Indels 117; Gaps 19; 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312 

I I I I I > I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | M I I 
Db 1 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 57 

Qy 313 CCGCCCGCCTT C AAGT AC C AGT T C GT GAC GGAGC C C GAGGAC GAG GAGGAC GAGGAGGAG 372 

I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I II I I I I I M | | | | || || IN 
Db 58 CAG CCCGCGTT CAAGT AC C AGT T C GT GAGG GAGC C C GAGGAC GAGGAG GAAGAAGAG 114 



Qy 373 GAGGAG GAC GAGGAG GAGGAC GAC GAGGAC C T AGAG GAACT GGAGGT GCT GGAGAGGAAG 432 

I I I I I N I I I I I I I I I I | | M I I I I I I I I I I I I I I I I | | | | | | | | | | || || | | | 



Db 115 GAGGAG GAAGAGGAG GAC GAG GAC GAAGAC C T G GAG GAG CT G GAG GT G C T GGAGAGGAAG 174 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

I I I I I I I I M I I I M I I I I I I I I I I I I I I I I I I I I I | I I I I I I I I 

Db 175 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 234 

Qy 4 87 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 54 6 

I I I I M I I I I I I I I I I I I I II II I II I I I II II I I I I I I I I I M I II 
Db 235 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 294 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

I I I M I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 295 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 354 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

M I I I I II I I II I I II I I I II I I I I I I I II I I I I I I I I I I I I I I I I I I I M I I 
Db 355 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 414 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I I I I ! I I I I I I II I I I I I II II I I I I I I I I I I I I I I Ml Ml 
Db 415 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 474 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I I I I M I II I I I I I I I I I I I I I I I I I || I II I II II 
Db 475 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 534 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 8 07 

I I I I I I I I I M I I I I I I I I I I II I I I I I I I I I II I I I I I I I I I II I I I I I I I I I II I 
Db 535 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 594 

Qy 8 08 GT GAT AC CCTCCTCTG CAGAAAAAAT TAT GGAT T T GAT G GAG CAGC C AGGT AAC ACT GT T 8 67 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I II I I I I I I I I I I II I I I II 
Db 595 GTGATACGCTCCTCTGCAGAAAA TATGGACTTGAAGGAGCAGCCAGGTAACACTATT 651 

Qy 8 68 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

Ml I I I I I I I I I II I I I I I II I I I II I II I I I I I I I I I I I I M I I I I I I I I II I I I I I 
Db 652 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 711 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

M I I I I I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II III II 
Db 712 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 771 

Qy 98 8 GT GT CAT C C T C AGAAG GAACAAT T GAAGAAACT T T AAAT GAAG CTT C T AAAGAGT T G C C A 1047 

I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I I I I I I II 
Db 772 GT ATT AC CCACTGAAGGAACACT T CAAGAAAAT GTCAGT GAAGCTT CTAAAGAGGT CT CA 831 

Qy 104 8 GAGAGG GCAACAAAT C C ATT T GT AAAT AGAGAT T TAG C AGAAT T T T CAGAAT T AGAAT AT 1107 

I I I I I I I I I M M I I II I II II I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 832 GAGAAGGCAAAAACT CT ACT CATAGAT AGAGATTTAACAGAGTTTT CAGAATTAGAATAC 891 

Qy 1108 T C AGAAAT GGGAT CAT CTTTTAAAGGCT CC C CAAAAGGAGAGT CAGC CATATTAGTAGAA 1167 

I M I I I I I I I I I I I I I I II I I Ml I I I | | I I I II II III II I I I I I I I 

Db 892 T C AGAAAT GG GAT CAT C GTT CAGT GT C T CT C CAAAAGCAGAAT CT GC C GT AAT AGT AG C A 951 

Qy 1168 AAC AC T AAG GAAGAAGT AAT T GT GAGG AGT AAA GACAAAGAGGATTTAGT T T GTAGT 1224 

II I I I I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I I I I I I I 

Db 952 AAT C CT AGGGAAGAAAT AAT C GT GAAAAATAAAGAT GAAGAAGAGAAGT T AGT T AGT AAT 1011 



Qy 1225 GCAGC C CT T C ACAGT CCACAAGAAT CAC CT GTGGGTAAAGAAGAC 12 69 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1012 AAC AT C CT T C ATAAT CAACAAGAGT T AC CT AC AGC T CT T ACTAAAT T GGT T AAAGAG GAT 1071 

Qy 127 0 AGAGT T GT GT CT C CAGAAAAGAC AAT GGAC AT T T T TAAT GAAAT GCAGAT GT C AGT AGT A 132 9 

I I I I I I I I I I II I I I I I III II I I I II I I I I I I I I I I MM I I 

Db 1072 GAAGT T GT GT C TT CAGAAAAAGCAAAAGAC AGT T T TAAT GAAAAGAGAGT T GC AGT G GAA 1131 

Qy 1330 GCACCT GT GAGGGAAGAGTATGCAGACTTTAAGCCATTTGAACAAGCAT GGGAAGTGAAA 1389 

M Ml II I I I I I II I I II I I II I I I II I I I I II I I I II M II I I I I I I M I 
Db 1132 GCT CCTAT GAGGGAGGAAT AT GCAGACTT CAAACCATT T GAGC GAGTAT GGGAAGTGAAA 1191 

Qy 139 0 GAT ACT TAT GAGG GAAGT AG GGAT GT GCT GGCT GCTAGAGCT AATGTG 1437 

MM Mill I I I II I II II I I I I II I M I M M 

Db 1192 GATA GT AAG GAAGAT AGT GAT AT GTTGGCTGCTG GAGGTAAAAT C GAGAGC AAC T T G 124 8 

Qy 1438 GAAAGT AAAGT G GACAGAAAAT GC T T G GAAG ATAGCCT GGAGCAAAAAAGT CT T G GGAAG 1497 

I I I I I I I I I I I M I I I II I I I II I I I II I II I I II II II I III I M 
Db 124 9 GAAAGT AAAGT GGAT AAAAAAT GT T TT GCAGAT AG C CT T GAG CAAACT AAT CAC GAAAAA 1308 

Qy 1498 GAT AGT GAAGG C AGAAAT GAG GAT GCT TCTTTCCC C AGT ACC C CAGAAC CT GT GAAGGAC 1557 

I I M II II I II Mill Ml II II II I II II II I M II I I II I I I II M 
Db 1309 GATAGTGAGAGTAGTAATGATGATACTT CTTT CCCCAGTACGCCAGAAGGTATAAAGGAT 1368 

Qy 1558 AG CT C C AGAGC AT AT AT T AC CT GT G CT T C CTT T AC CT CAGCAACCGAAAGCACCACA 1614 

I I M I M I M M II I I II I I I II I I I I I I M M I II I I II II 

Db 1369 C GT C C AG GAGCAT AT AT CAC AT GT GCT C C CTT TAACC CAGCAGCAACT GAGAGC AT T GC A 142 8 

Qy 1615 GCAAACACT T T C C C T TT GT T AGAAGAT CAT AC T T CAGAAAAT AAAAC AGAT GAAAAAAAA 1674 

I I I M I III I I I I I II II I I I I II M I I I II I II I I M I II II II II I I I M I 
Db 1429 ACAAAC AT T T TT C CT TT GTT AG GAGAT C C T ACT T CAGAAAAT AAG AC CG AT GAAAAAAAA 1488 

Qy 1675 AT AGAAGAAAGGAAG GC C C AAAT T AT AAC AGAGAAG ACT AGC C C CAAAAC GT CAAAT 1731 

I I I I I I I I I I II I M I I I II II I I I I I I II I M MUM II I II II Mill 

Db 14 8 9 AT AGAAGAAAAGAAGGCC CAAAT AGT AACAGAGAAGAAT ACT AGCAC CAAAACAT CAAAC 154 8 

Qy 1732 CCTTTCCTT GT AGC AGT AC AG GAT T C T GAGGC AGAT TAT GT T ACAAC AGAT AC CT T AT C A 1791 

I I I I I I I I I II I I I I II I I I I I I I II I I M I M I II I I I I II II II I I III II 
Db 154 9 CCTTTTCTT GT AG CAGC AC AGGAT T CT GAGAC AGAT TAT GT C ACAAC AGAT AAT T T AAC A 1608 

Qy 1792 AAGGTGACTGAGGCAGCAGTGT CAAAC AT GCCTGAAGGTCTGACGCCAGATTTAGTTCAG 1851 

I I I I I I I M I I II M III I I I I II M I M I M I I II I II I I II II II II I III 
Db 1609 AAGGT G ACT GAG GAAGT C GT G G CAAAC AT GC C T GAAGGC CT GACT C C AGAT T T AGT ACAG 1668 

Qy 18 52 GAAGCAT GT GAAAGT GAACT GAAT GAAGCCACAGGT ACAAAGATTGCT TAT GAAACAAAA 1911 

I I I I I I I I I M II II II I II M I I I I I II I M I I II I M II I II I II M II I I I II 
Db 1669 GAAGCAT GT GAAAGT GAAT T GAAT GAAGT T AC T GGTACAAAGATT GCT TAT GAAACAAAA 172 8 

Qy 1912 GTGGACTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAGCACAGCTT 1971 

M I I II I II I I I I II I II I II I I III Mill Mill II M II II I I I I I M 
Db 172 9 ATGGACTTGGTTCAAACATCAGAAGTTATGCAAGAGTCACTCTATCCTGCAGCACAGCTT 178 8 

Qy 1972 T G C C CAT CAT T T GAG GAAGCT GAAG C AACT C C GT CAC C AGT T T T GC CT GAT AT T GT TAT G 2031 

I I I I I I I I I M M I II I Mill I I I I I II I II II I I II I II II I II I I I II I I 
Db 17 8 9 T GC C CAT CAT T T GAAGAGT C AGAAGC T AC T C C TT C AC CAGT T T T GC C T GAC AT T GT TAT G 184 8 



Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

I I I I I I I I I I I I I I M I I I I I I I I I I I I M I I I I I I I I I I I I I I I | 

Db 184 9 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 1908 

Qy 2092 TCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCTGAAAAC 2151 

I I Ml I I I I I II M I I I I I I II I I I I II I I I II I I I I I I I I I I | | | | 
Db 19 09 T C AC CAT T AGAAG CT T C T T C AGT T AAT TAT G AAAG C AT AAAAC AT GAG C CT G AAAAC 1965 

Qy 2152 C C C C C AC CAT AT GAAG AAG C CAT G AAT GT AG C AC T AAAAG C T T T G G G AAC AAAG GAA 2208 

I I I I I I I I I I I I I I M I I I I II I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 1966 C C C C C AC CAT AT GAAG AGG C CAT GAGT GT AT C ACT AAAAAAAGT AT C AG G AAT AAAG GAA 2025 

Qy 22 09 GGAATAAAAGAGCCTGAAAGTTTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTTATATA 2268 

I Ml I I I M I I I I I I I I I I I I I I I I I I I I I | | | | | | | | | | | | | M | | | M I I I I 
Db 2026 GAAAT T AAAGAG C C T GAAAAT AT T AAT GC AGC T C T T CAAGAAAC AGAAG CT C C T TAT AT A 2085 

Qy 22 69 TCCATTGCGTGTGATTTAATTAAAGAAACAAAGCTCTCCACTGAGCCAAGTCCAGATTTC 2 328 

M I I M I I I I I I I I I I I I I I II I I I I I I I I I I I II I I I I III III I I I I | | 
Db 2086 TCTATTGCATGTGATTTAATTAAAGAAACAAAGCTTTCTGCTGAACCAGCTCCGGATTTC 2145 

Qy 2329 TCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGCTAGTG 2388 

I I I I I I I I I I I I I I I I I I I I I I I I I | | | M | | | | | | | | || || | | | | | 
Db 2146 TCTGATTATTCAGAAATGGCAAAAGTTGAACAGCCAGTGCCTGATCATTCTGAGCTAGTT 2205 

Qy 23 8 9 GAG GAT T C C T C AC C T GAAT C T GAAC C AG T T G AC T TAT T T AGT GAT GAT T C GAT T C C T GAA 2448 

M I I I I I I I I I I M II I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I || MIM 
Db 2206 GAAG AT T C C T C AC C T GAT T C T GAAC C AGT T G AC T T AT T T AGT GAT GAT T C AAT AC C T G AC 2265 

Qy 244 9 GT CC C AC AAACAC AAGAGGAGGCT GT GAT GCT CAT GAAGGAGAGT CT CACT GA A 2502 

I I I I I I I I I I I I I I I I I I II I I I I II I I I II I I I I I I I I I I I II I 
Db 22 66 GTT C CACAAAAACAAGAT GAAACT GT GAT GCT T GT GAAAGAAAGT CT CACT GAGACTT CA 2325 

Qy 2503 GTGTCTGAGACAGTAGCCCAGCACAAAGAGGAGAGACTTAGTGCCTCACCTCAGGAGCTA 2562 

I I Ml I IN I I I I I I I I I I I II I III III I 

Db 232 6 TTTGAGTCAATGATAGAATATGAAAATAAGGAAAAACTCAGTGCTTTGCCACCTGAGGGA 2385 

Qy 2563 GGAAAGCCATATTTAGAGTCTTTTCAGCCCAATTTACATAGTACAAAAGATGC TGCA 2619 

I I I I I I I I I I I I I I M II I I I I I I I || I I M I I I I II II I I I I I I 
Db 238 6 GGAAAGCCATATTTGGAATCTTTTAAGCTCAGTTTAGATAACACAAAAGATACCCTGTTA 2 445 

Qy 2 620 TCTAATGACATTCCAACATTGACCAAAAAGGAGAAAATTTCTTTGCAAATGGAAGAGTTT 2 67 9 

II I I I I M I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I M I I III I 

Db 2446 CCTGATGAAGTTTCAACATTGAGCAAAAAGGAGAAAATTCCTTTGCAGATGGAGGAGCTC 2505 

Qy 2680 AATACTGCAATTTATTCAAATGATGACTTACTTTCTTCTAAGGAAGACAAAATAAAAGAA 2739 

I I I I I I I I I II I II I I I I I I I I I I I I I I || | | || | | | | | | | | | | | | | | | | 
Db 2506 AGT AC T G C AGT T TAT T C AAAT GAT GACT TAT T TAT T T CT AAG GAAG C AC AG AT AAGAGAA 2565 

Qy 274 0 AGT GAAAC ATT T T C AGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT T T C C C AC GT TT GT C 2799 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I II I I I I II II II II II 
Db 2566 ACT GAAAC GT T T T C AGAT T CAT CT C CAAT T GAAAT TAT AGAT GAGT T C C CT AC AT T GAT C 2625 

Qy 2 8 00 AGT GCT AAAGAT GATT C TCCTAAATTAGCCAAGGAGTACACTGATCTAGAAGTATCC 2856 

I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 262 6 AGT T CT AAAACT GAT T CAT T T T CT AAAT TAG C CAG GGAAT AT ACT GAC CT AGAAGT AT C C 2 68 5 

Qy 2857 GACAAAAGT GAAAT T GC T AAT AT C CAAAGC GG G G C AGATT CATT GC CT T G CT T AGAAT T G 2916 



Db 2686 C ACAAAAGT GAAAT T GC TAAT G C C C C GGAT GGAGC T G G GT CAT T GC C T T GC AC AGAAT T G 2745 

Qy 2917 C C CT GT GAC CTTTCTTT CAAGAAT AT AT AT C CT AAAGAT GAAG TACATGTTTCA 2970 

III I I I I I I I I II I I I I I I I I I I I I I I I I I | | | | | | | | | | | 

Db 2746 C C C CAT GAC CTTTCTTT GAAGAAC AT AC AAC C C AAAGT T G AAGAGAAAAT C AGT T T C T CA 2805 

Qy 2971 GATGAATT C T C C GAAAATAG GT C C AGT GT AT CTAAGG C AT C CAT AT C GC CT T CAAAT GT C 3030 

M I I I I I M I I I I I I I I I I I I I I I I I I III il in 

Db 2806 GAT GAC TT T T CTAAAAAT GGGT CT G C T AC AT CAAAGGT G C T C T TAT T GC C T C C AGAT GT T 2865 

Qy 3031 TCTGCTTTG GAAC C T C AGAC AGAAAT GGG C AGCAT AGT T AAAT C CAAAT C AC T T AC GAAA 3090 

I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I | M III I I I I 
Db 2866 TCTGCTTTGGC C AC T C AAGC AGAGAT AGAGAGCAT AGTTAAAC C CAAAGTT C T T GTGAAA 2925 

Qy 3091 GAAGCAGAGAAAAAACTTCCTTCTGACACAGAGAAAGAGGACAGATCCCTGTCAGCTGTA 3150 

I I I I I I I M I I I I I I I I I I I I I II I I I I I I I I I I I I I I | | | | | | || |M I I 
Db 2926 GAAGCT GAGAAAAAAC TTCCTTCC GAT AC AGAAAAAGAGGAC AGAT CAC CAT C T G CT AT A 2985 

Qy 3151 T T GT C AGC AGAGCT GAGTAAAAC T T C AGT T GT T GAC C T C CT C TAC T GGAGAGAC ATT AAG 3210 

M I I I I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I | I I I I I I I I I I I M I II I I 
Db 2986 T T T T C AGC AGAG CT GAGTAAAACTT C AGT T GT T GAC CT C CT GTAC T GGAGAGAC ATT AAG 3045 

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270 

M I II I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I M || | | | | | | | | | | | | | 
Db 3046 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3105 

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

Mill II I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I M I | | | | 
Db 3106 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3165 

Qy 3331 ATATATAAGGGCGT GATCCAGGCT AT C CAGAAAT CAGAT GAAGGCCACC CATT CAGGGCA 3390 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I II I I I I M I I I I I I I I M I I I I I I 
Db 3166 AT AT ACAAGG GT GT GAT C CAAGC T AT C CAGAAAT CAGAT GAAGGC C AC C CAT T C AGG GCA 3225 

Qy 3391 TAT T T AGAAT CT GAAGT T GC TAT AT C AGAGGAAT T GGT T CAGAAAT AC AGT AAT T CT G CT 3450 

Ml I I I I I I I I I I I I II I I I I I M I I I I I I I I I I I I I I I I I | | | | | | | | || || | | 
Db 3226 TAT C T GGAAT CT GAAGT T GC TAT AT C T GAGGAGT T G GT T C AGAAGT AC AGTAAT T CT G CT 3285 

Qy 3451 CT T G GT CAT GT GAAC AG CAC AAT AAAAGAACT GAGGC GGC T T T T C T T AGTT GAT GAT T T A 3510 

I I I M I I I I I I I I I I I I I I I I I I I | | I I I I I I I I II I I I I | | | | | | | | | || | | | 
Db 3286 C T T GGT CAT GT GAACT GC AC GATAAAG GAAC T C AGGC GC C T C T T CT T AGTT GAT GAT T T A 3345 

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570 

I I I I I I M I I I II II I II I I I I I I I I I I I I I I I I I I I I I I I I I I 1 || | | | | | | | || 
Db 3346 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3405 

Qy 3571 AAT GGT CT GACAC TAC T GAT TT T AGCT CT GAT C T CAC T CT T C AGT AT T C CT GT T ATT TAT 3630 

I I I I I I I I M I I I I I I I I I II I I I I I I I || M I I I I I I I I I I I I I I I II II I I I I I 
Db 3406 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3465 

Qy 3631 GAAC G GC AT CAG GT GC AGAT AGAT CAT TAT CT AG GACT T G CAAACAAGAGT GT TAAGGAT 3690 

M II I I I I I I II I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I | | | | | M I I I III 
Db 3466 GAACGGCAT CAGGC GCAGAT AGAT CATTAT CTAGGACTT GCAAATAAGAAT GT TAAAGAT 3525 

Qy 3691 G C CAT G GC C AAAAT C CAAGCAAAAAT CC CT G GAT T GAAGC GCAAAGC AG A 3740 
M INN I I I I I I II I I I I I M I | I I I M | || | | | M I I I I I I I I || 

Db 3526 G C TAT GGC TAAAAT C CAAG CAAAAAT C C CT G GAT T GAAG CGCAAAG CT GA 3575 



RESULT 9 

US-10-267-502-212 

; Sequence 212, Application US/10267502 

; Publication No. US20040071700A1 

; GENERAL INFORMATION: 

; APPLICANT: Kim, Jaeseob 

; APPLICANT: Galant, Ron 

; TITLE OF INVENTION: Obesity Linked Genes 
; FILE REFERENCE: LSD-07416 

; CURRENT APPLICATION NUMBER: US/10/267,502 

; CURRENT FILING DATE: 2003-01-27 
; NUMBER OF SEQ ID NOS: 439 
; SOFTWARE: PatentIn version 3.2 
; SEQ ID NO 212 

; LENGTH: 3579 

; TYPE: DNA 
; ORGANISM: Homo sapiens 
US-10-267-502-212 

Query Match 61.2%; Score 2289.2; DB 12; Length 3579; 

Best Local Similarity 81.5%; Pred. No. 0; 



Matches 2925; Conservative 0; Mismatches 548; Indels 117; Gaps 19; 
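The two percentages reported for this result can be cross-checked from the other figures in the record. The sketch below assumes that Query Match is the score divided by the perfect score (3741, as listed in the run header) and that Best Local Similarity is matches divided by the aligned columns (matches + mismatches + indels). Both definitions are assumptions, not taken from GenCore documentation; they are used here only because they reproduce the printed values. The helper name `percent` is hypothetical.

```python
# Cross-check of the summary statistics printed for this hit.
# The two formulas below are assumed definitions, not documented GenCore behaviour.

def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100.0 * numerator / denominator, 1)

perfect_score = 3741          # from the run header (query length, identity scoring)
score = 2289.2                # "Score" field of this result
matches, mismatches, indels = 2925, 548, 117

query_match = percent(score, perfect_score)
best_local_similarity = percent(matches, matches + mismatches + indels)

print(query_match)            # compare with "Query Match 61.2%"
print(best_local_similarity)  # compare with "Best Local Similarity 81.5%"
```

Under these assumptions the aligned region spans 2925 + 548 + 117 = 3590 columns.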


Qy 253 AT G GAAGAC AT AGAC C AGTC GT C G C T GGT CTCCTCGTC CAC GGAC AGC CCGCCCCGGCCT 312 

1 M 1 II II 1 1 II II M M 1 1 1 1 1 1 1 I I I | | | | | | | | | | | | | | | || || | | 
Db 1 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 57 

Qy 313 CCGCCCGCCTT CAAGT AC CAGT T C GT GAC G GAGC C C GAG GAC GAGGAGGAC GAG GAGGAG 372 

1 1 1 1 1 i 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 M | | | | | | | | | || | || | | | | | | || | | | 
Db 58 C AGC C C GC GT T CAAGT AC CAGTT C GT GAGGGAGC C C GAGGAC GAGGAG GAAGAAGAG 114 

Qy 373 GAGGAGGAC GAGGAGGAGGACGAC GAGGACCTAGAGGAACT GGAGGT GCT GGAGAGGAAG 432 

Ml MMIM Ml Mill Mill IMMM MIIMI 
Db 115 GAGGAGGAAGAGGAGGAC GAGGAC GAAGACCT GGAGGAGCT GGAGGT GCT GGAGAGGAAG 174 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486 

M M M 1 1 II IMM 1 1 1 1 II II 1 1 1 1 1 1 M M 1 
Db 175 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 234 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546 

1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 M 1 II M II 1 II 1 1 1 M II 1 II 1 II 
Db 235 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 294 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

M 1 II 1 M 1 1 1 II || || | M II 1 M 1 II 1 M 1 | | | M 1 
Db 295 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 354 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

1 1 1 1 1 1 1 1 1 1 M 1 1 II I I I II 1 1 1 1 | | || | | M 
Db 355 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 414 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

1 1 1 1 1 N 1 1 II 1 II M II 1 Ml MM II Ml 
Db 415 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 474 



Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750 

I I I I I I I I I I I I I I Mill I I I I I I I I I I I I II I I I I 
Db 475 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 534 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I I M I I I I I I I M I I I I I I I I I I I I I I I I I I II II I I I I I I I || | | | | || M | | | | | 
Db 535 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 594 

Qy 808 GT GAT AC CCTCCTCT GC AGAAAAAAT TAT GGAT TT GAT GGAG C AG C C AG GTAAC ACT GTT 8 67 

I I I I I I I I I I I I I I I I I II I I I I I I I I I I I M I I I I I I I I I I I I I I I | | | | || 
Db 595 GTGATACGCTCCTCTGCAGAAAA TAT GGACTTGAAGGAGCAGC CAGGTAACACT ATT 651 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927 

Ml I I I M I I I I I I I M I I I I I I I I I I I I I I I I II I I M I II I I I I I I I II II I I I I I 
Db 652 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 711 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

M II I I I I I I I I I I I I I I I II I MINIUM I II I I I I I I I I I || I || || 
Db 712 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 7 71 

Qy 988 GT GT CAT C CT C AGAAGGAACAAT T GAAGAAAC T T TAAAT GAAG CT T CT AAAGAGT T GC C A 104 7 

M I I M I I I I I I I I M II I I I I I I I I I I I | || | | || M | || | | | || 
Db 772 GT AT T AC C C ACT GAAGGAACAC T T C AAGAAAAT GT CAGT GAAGCT T CTAAAGAGGT CT C A 831 

Qy 1048 GAGAGGGCAACAAATCCATTTGTAAATAGAGATTTAGCAGAATTTTCAGAATTAGAATAT 1107 

MM I M M M M I I II I I I M I I I I II MM II I I II I M M I II II I 

Db 832 GAGAAGGCAAAAACTCTACT CAT AGATAGAGATTTAACAGAGTTTT CAGAATT AGAATAC 8 91 

Qy 110 8 T C AGAAAT G GGAT CAT CT T T T AAAG G CT C C C CAAAAGGAGAGT C AGC CAT AT T AGTAGAA 1167 

M II I II II II I II II I M I I I II I I II I II III II II I II I II I M I 
Db 892 T C AGAAAT G GGAT CAT C GT T CAGT GT CT CT C CAAAAGC AGAAT CT GC C GT AAT AGTAGCA 951 

Qy 1168 AACACTAAGGAAGAAGTAAT T GT GAGGAGT AAA GACAAAGAGGAT T T AGT T T GT AGT 1224 

II II I I I I I II I M I I I I I I M I I I II I I I I M I II I I 

Db 952 AAT C CT AG G GAAGAAAT AAT C GT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGT T AGT AAT 1011 

Qy 1225 G CAG C C CT T CACAGT C C ACAAGAAT C AC CT GT GGGT AAAGAAGAC 1269 

I I I I I I I II II I I I I I II II I M I II II I I I 

Db 1012 AAC AT C C T T CAT AAT CAACAAGAGT T AC CT AC AG CT CT T ACTAAATT GGT TAAAGAGGAT 1071 

Qy 1270 AGAGT T GT GT CT C C AGAAAAGACAAT GGACAT T T T TAAT GAAAT G C AGAT GT CAGT AGT A 1329 

M I I II I I M I II I I II III M II I I I II II II I I I I I I II I I 

Db 1072 GAAGT T GT GT CT T C AGAAAAAG C AAAAGACAGT TT TAAT GAAAAGAGAGT T G CAGT GGAA 1131 

Qy 1330 GC AC CT GT GAG G GAAGAGT AT GC AGACT TT AAGC C AT T T GAACAAGC AT GG GAAGT GAAA 1389 

M Ml M I M II II II M II I II I I II I II I II II I II I I II I I || || I || 
Db 1132 G CT C CT AT GAGGGAGGAAT AT G C AGACT T CAAAC CAT T T GAG C GAGT AT GGGAAGT GAAA 1191 

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437 

MM I || I I I II I II I I I I II I || I M I I || || 

Db 1192 GATA GT AAGGAAGAT AGT GAT AT GTTGGCTGCTG GAG GTAAAAT C GAGAGCAAC T T G 124 8 

Qy 14 38 GAAAGTAAAGT G GACAGAAAAT GC T T G GAAGAT AG C CT G GAGCAAAAAAGT CT T G GGAAG 14 97 

I M M I I I || M II I II I I M II I I I I II II II II II II I III I II 
Db 1249 GAAAGTAAAGT GGAT AAAAAAT GT T T T GC AGAT AG C CT T GAG C AAACT AAT CAC GAAAAA 1308 



Qy 1498 GAT AGT GAAG G C AGAAAT GAG GAT GCTTCTTTCCC C AGT AC C C CAGAAC C T GT GAAG GAC 1557 

N 1 1 1 1 1 1 1 1 1 1 1 1 1 1 III 1 | II II 1 1 II 1 II 1 1 1 MINI 1 1 1 1 | 1 1 
Db 1309 GAT AGT GAGAGT AGTAAT GAT GAT AC TTCTTTCCC CAGT AC G C C AGAAGGT ATAAAGGAT 1368 

Qy 1558 AGC T C C AGAGC AT AT AT T AC CT GT GCTTCCTT T AC C T C AGCAAC C GAAAGC AC C AC A 1614 

1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 I II 1 1 1 II II 1 1 1 1 II || | | | | || || 
Db 1369 C GT C C AG GAG CAT AT AT C AC AT GTGCTCCCTT T AAC C CAGC AGCAACT GAGAG CAT T GC A 1428 

Qy 1615 G C AAAC AC TTTCCCTTTGT T AGAAGAT CAT AC T T C AGAAAAT AAAAC AGAT GAAAAAAAA 1674 

1 1 1 1 1 1 III 1 II I I I I I I | | | | | 1 1 1 1 1 1 1 1 M I I I I | | || | | | M 1 1 1 1 1 1 1 
Db 1429 ACAAACATT TT T C CTTT GTTAGGAGAT CCTACTT CAGAAAATAAGAC C GATGAAAAAAAA 1488 

Qy 1675 ATAGAAGAAAGGAAGGCC CAAATTATAACAGAGAAG ACTAGC CCCAAAACGT CAAAT 1731 

1 ' 1 1 1 1 , , 1 1 | | | | | | || | | | | | M 1 1 1 1 1 1 1 1 II 1 1 II 1 II II II II 1 1 1 
Db 1489 AT AGAAGAAAAGAAG GCC CAAAT AGTAACAGAGAAGAAT ACT AGCAC CAAAAC AT CAAAC 1548 

Qy 1732 C CTTT C CTT GTAGCAGTACAGGATT CT GAGGCAGAT TAT GTT ACAACAGATACCT TAT CA 1791 

IMII MINIMI! 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 II III II 
Db 1549 CCTTTTCTTGTAGCAGCACAGGATTCTGAGACAGATTATGTCACAACAGATAATTTAACA 1608 

Qy 1792 AAGGT GACT GAGGCAGCAGT GT CAAAC AT GC CT GAAGGT CT GAC GCC AGAT TT AGT T C AG 1851 

1 1 1 1 1 1 1 1 II 1 M | | | || | | | | | | | | | M 1 1 1 II 1 1 1 1 1 1 1 1 | | | Ml 
Db 1609 AAG GT GAC T GAG GAAGT C GT GG C AAACAT GC CT GAAGGC CT GACT CC AGAT T T AGT ACAG 1668 

Qy 1852 GAAGCAT GT GAAAGT GAACT GAAT GAAGC CACAGGTACAAAGATT GCT TAT GAAACAAAA 1911 

N 1 1 M M 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 | | || 1 1 M 1 M 1 II 1 1 1 1 1 1 1 1 1 
Db 1669 GAAGCAT GTGAAAGT GAATT GAAT GAAGT TACT GGTACAAAGATT GCT TAT GAAACAAAA 1728 

Qy 1912 GT G GAC T T GGT C CAAACAT C AGAAGC T AT ACAAGAAT CACT T T AC CC C AC AGC AC AGC T T 1971 

1 1 1 M II 1 1 1 1 1 1 1 II 1 | | || | | Ml | M M 1 1 II 1 II M 1 M | | | | | | | | 
Db 1729 AT GGAC T T GGT TCAAACAT C AGAAGT TAT GCAAGAGT CACT CT AT C CT G C AGC AC AGCTT 1788 

Qy 1972 T G CC CAT CAT T TGAGGAAG C T GAAGCAACT C C GT C AC CAGT T T T GC CT GAT AT T GT TAT G 2031 

M 1 1 M M 1 M 1 1 1 M 1 1 1 1 1 I I I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I | || | | | | | | | 
Db 1789 T G CC CAT CAT T TGAAGAGT C AGAAGC TACT C CT T CAC CAGT T T T G C CT GAC AT T GT TAT G 1848 

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091 

1 1 1 1 1 1 1 1 1 1 1 MINI 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I | | | | | | | M || | 
Db 1849 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 1908 

Qy 2092 T C C C CAC T GGAAG CAC C T C CT C CAGT T AGT TAT GAC AGT ATAAAG CT T GAGC C T GAAAAC 2151 

II Ml 1 1 1 1 1 II || | | | | || | | | | | | || Mill | | | | | | | || || | | | 
Db 1909 T CAC CAT T AGAAG C T T C T T C AG T T AAT TAT GAAAG CAT AAAAC AT GAG C C T GAAAAC 1965 

Qy 2152 C C C C CAC CAT AT GAAGAAGC CAT GAAT GT AGC ACT AAAAGC T T T G G GAACAAAGGAA 2208 

M II 1 1 1 II M 1 II 1 1 1 1 II 1 1 II II 1 1 1 | | | Ml!! | | | | | | | || | | | 
Db 1966 C C C C CAC CAT AT GAAGAGG C CAT GAGT GT AT C ACT AAAAAAAGT AT C AG GAATAAAGGAA 2025 

Qy 2209 GGAAT AAAAGAGC CT GAAAGT TT T AAT GC AGCT GTT CAGGAAAC AGAAG CT C C T TAT AT A 2268 

1 IN M 1 1 M II 1 1 M 1 1 1 II 1 1 1 M M 1 1 1 II 1 1 1 1 II 1 1 1 1 1 1 
Db 2026 GAAAT T AAAGAGC CT GAAAAT AT TAAT G C AGC T CTT CAAGAAAC AGAAGC T C C T TAT ATA 2085 

Qy 2269 T C CAT T G C GT GT GAT T T AAT TAAAGAAACAAAGCT C T C CAC T GAGC C AAGT C C AGAT T T C 2328 

M 1 1 1 M II 1 1 1 1 1 II 1 1 | | || || | | | || | | | | || nil in in | | | | | | 
Db 2086 T CT AT T GCAT GT GAT TT AAT TAAAGAAACAAAGCT T T C T G CT GAAC C AGCT C C GGAT T T C 2145 

Qy 2329 T C TAAT TAT T C AGAAAT AG CAAAAT T C GAGAAGT C G GT GC C C GAACAC GCT GAG C T AGT G 2388 



M I I I I I I I I I I I I I I I I I I I I I M I I I I II I I I I I I I I I II I I I M 

Db 2146 T C T GATT AT T C AGAAAT GG CAAAAGT T GAAC AG C C AGT GC CT GAT CAT T CT GAGCTAGT T 2205 

Qy 238 9 GAG GATT C CT C AC C T GAAT CT GAAC C AGT T GACT T AT T T AGT GAT GAT T C GAT T C CT GAA 244 8 

I I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I II I I I I I I I I I II IMM 

Db 2206 GAAGAT T C C T C AC C T GAT T CT GAAC C AGT T GAC T TAT T T AGT GAT GAT T CAAT AC C T GAC 22 65 

Qy 244 9 GT C C CAC AAAC ACAAGAGGAGGC T GT GAT G CT CAT GAAG GAGAGT C T C ACT GA A 2502 

II I M I I I I MINI II I I II I I I | | | I I I I II | I I I I I I I I I I | 

Db 2266 GT T C CACAAAAACAAGAT GAAAC T GT GAT G C T T GT GAAAGAAAGT C T C ACT GAGACT T CA 2325 

Qy 2503 GT GT CT GAGACAGT AGC C C AGC ACAAAGAG GAGAGACT T AGT GC C T CAC CT C AGGAGC T A 2562 

I I I I I I I M I I M I I II I I I I I I Ml Ml I 

Db 2326 TTT GAGT CAAT GATAGAATAT GAAAATAAGGAAAAACT CAGT GCTTT GCCACCT GAGGGA 2385 

Qy 2563 GGAAAGC CAT AT T T AGAGT C T TT T C AGC C CAAT T T ACAT AGT ACAAAAGAT GC TGCA 2619 

I M II II II II II I I I I M I M II I M I I I I I I I I I I || I I I | | | 
Db 238 6 GGAAAGC CAT ATT T G GAAT CT TT TAAGCT CAGT T TAGATAAC ACAAAAGAT AC C C T GT T A 2445 

Qy 2 62 0 T C T AAT GAC AT T C CAAC AT T GAC CAAAAAGGAGAAAAT TT CT TT GCAAAT GGAAGAGT T T 2679 

M I I I I I I M M I I I I I M I I I I I I II I I I I I I || | | || | | M I I III I 
Db 2446 C C T GAT GAAGT TT CAAC ATT GAGC AAAAAGGAGAAAAT T C CT TT GC AGAT GGAGGAGCT C 2505 

Qy 2680 AAT ACT G CAAT TT AT T CAAAT GAT GACTT ACT T T CT T CTAAGGAAGACAAAATAAAAGAA 2739 

I I I I M II I I I I I I I I I I II I I II I I II I I I I II I I II I I I I || || I I I I 
Db 2506 AGT ACT G C AGT TT AT T CAAAT GAT GACT TAT T T ATT T CTAAGGAAG CAC AGAT AAGAGAA 2565 

Qy 2740 AGT GAAACATT T T C AGAT T CAT CT C C GAT T GAGAT AAT AGAT GAAT T T C C CAC GT TT GT C 2799 

I I I I M I M II II II II I II M II I I II I II II I I I I II II II II || || 
Db 2 566 ACT GAAACGTTTT CAGATT CAT CT CCAATT GAAATTATAGAT GAGTT C C CT ACATT GAT C 2 625 

Qy 2 800 AGT G C T AAAGAT GAT T C T C CT AAAT T AGC CAAGGAGT AC ACT GAT C T AGAAGT AT C C 2856 

III I I I M I M I I I I I I I I M I I II I I I II I I I I I I I I I I || I 

Db 2 62 6 AGT T C TAAAACT GAT T CAT T T T C TAAAT TAG C C AGGGAAT AT ACT GAC C T AGAAGT AT C C 2685 

Qy 2857 GACAAAAGT GAAAT T GC T AATAT C C AAAGC GG GG C AGAT T CAT T GC CT T G CT T AGAAT T G 2916 

I N I M I I I I II I I II I II I I | || I II I II I Ml 

Db 2686 CACAAAAGT GAAAT T GCTAAT GC C C C GGAT GGAGC T GG GT CAT T GC C T T G CAC AGAAT T G 2745 

Qy 2 917 CC CTGT GAC CTTT CT TT CAAGAATATATATCCTAAAGATGAAG T ACAT GT T T C A 297 0 

I I I II I I II I I I I I I I I II I II I I I I II M I I I I I I 

Db 2 74 6 C C C CAT GAC CTT T CTTT GAAGAAC AT ACAACC C AAAGT T GAAGAGAAAAT CAGTT T CT C A 2805 

Qy 2971 GAT GAAT T CT C C GAAAAT AG GT C CAGT GT AT CT AAGGCAT C CAT AT C G C CT T CAAAT GT C 3030 

I I I I I I I I I M I M M I I I II I I I I I I I || | | | | | | | || | 

Db 2806 GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 2865 

Qy 3031 T CT GC T TT G GAAC CT CAGAC AGAAAT GGGC AG CAT AGT TAAAT C CAAAT C ACT T AC GAAA 3090 

M I I I II II I I II I I I II I I | II II I I I I I I II II I I I III II I I 
Db 2866 TCTGCTTTGGC CAC T CAAG C AGAGAT AGAGAG CAT AGT TAAAC C CAAAGTT C T T GT GAAA 2925 

Qy 3091 GAAGCAGAGAAAAAAC T T C C T T CT GAC AC AG AGAAAG AGGAC AGAT C C CT GT C AGC T GT A 3150 

I I I I I I I I II I I II I M I M I I M M I II II II I I II I || Ml || 

Db 2 92 6 GAAGC T GAGAAAAAACT T C CT T C C GAT AC AGAAAAAGAGGAC AGAT CAC CAT C T GC TAT A 2 985 

Qy 3151 T T GT C AG C AGAG CT GAGT AAAACT T CAGT T GT T GAC C T C C T C T ACT G GAGAGAC AT TAAG 3210 

H I M I I I I II II II II II II I II I II I I II II I II I I II I II I I I II I I I I I I I I I I 



Db 2986 TT T T CAGC AGAGC T GAGTAAAAC T T C AGT T GT T GAC CT C C T GT ACT G GAGAGAC AT T AAG 3045 



Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 327 0 

I I I I I I I I I I I I I I I M I I I I I I I II I I I I I I I I II || I I || | M I I I I I I I I I I 
Db 3046 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3105 

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

I M I I M Mill I I I I I I I I I I II I II I I I I I I I || I I I I | | | | | I | | | | | | | | | 
Db 3106 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3165 

Qy 3331 AT AT ATAAGG GC GT GAT C CAGGCT AT C C AGAAAT C AGAT GAAGGC C AC C CAT T C AGGGC A 3390 

I I I I I M I I I I I I I I I I I I I I I I I I I I M II I I II I I I I I I I I I I I M I M I I | || | 
Db 3166 AT AT ACAAGGGT GT GAT C CAAG C TAT C C AGAAAT CAGAT GAAGGC C AC C CAT T C AGGGC A 3225 

Qy 3391 TAT T T AGAAT CT GAAGT T G CT AT AT CAGAGGAATT GGTT C AGAAAT AC AGT AATT CT GC T 3450 

Ml I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I || || I | I I I I II I I I I I I I I I 
Db 3226 TAT CTGGAAT CT GAAGTT GCTATAT CT GAGGAGTT GGTT CAGAAGTACAGTAATT CTGCT 3285 

Qy 3451 CTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTA 3510

HI Mill I I II Mill I I I I I I I II I II I II I I I I I II I I I 

Db 3286 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3345

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570 

M M II M I I I I I I I II I I I I I I II I I I I II II I I I II I I I || II II I I I I M I I I 
Db 3346 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3405

Qy 3571 AATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTAT 3630

I M M I I I I I I I M I I I I I I I II Mill II I I II II M II M I M II II II II II I 
Db 3406 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3465

Qy 3631 GAACGGCATCAGGTGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGAT 3690

I I M I M M II I I II I M I I II II II I I I II I II I I I I I I I I I I I I I I I I I I | IN 
Db 3466 GAACGGCATCAGGCGCAGATAGATCATTATCTAGGACTTGCAAATAAGAATGTTAAAGAT 3525

Qy 3691 GCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740

M I M M I II I II II I I I I I I I I I II I I M I I II I I I I I I I II I I II 
Db 3526 GCTATGGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3575



RESULT 10 
US-10-327-213-8 

; Sequence 8, Application US/10327213
; Publication No. US20040121341A1
; GENERAL INFORMATION:
; APPLICANT: FILBIN, MARIE T.
; APPLICANT: DOMENICONI, MARCO
; APPLICANT: CAO, ZIXUAN
; TITLE OF INVENTION: INHIBITORS OF MYELIN-ASSOCIATED GLYCOPROTEIN (MAG)
; TITLE OF INVENTION: ACTIVITY FOR REGULATING NEURAL GROWTH AND REGENERATION
; FILE REFERENCE: CUNY/ 003
; CURRENT APPLICATION NUMBER: US/10/327,213
; CURRENT FILING DATE: 2002-12-20
; NUMBER OF SEQ ID NOS: 43
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 8
; LENGTH: 3579
; TYPE: DNA
; ORGANISM: Homo sapiens
US-10-327-213-8 



Query Match 61.2%; Score 2289.2; DB 17; Length 3579; 

Best Local Similarity 81.5%; Pred. No. 0; 

Matches 2925; Conservative 0; Mismatches 548; Indels 117; Gaps 19 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312

I I I I I I I I I I I I I I I I I I I I I I I I ! I I II I I I I I I I I | I | | | | | | | | | | 
Db 1 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 57 

Qy 313 CCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAGGACGAGGAGGAG 372

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II II I II 
Db 58 C AG CCCGCGTT C AAG T AC CAGT T C G T GAG G GAG C C C GAG GAC GAG GAG GAAGAAGAG 114 

Qy 373 GAG GAG GAC GAG GAG GAG GAC GAC GAG GACC TAGAGGAACT G GAG GT GCT GGAGAG GAAG 432 

I I I I I I I I MINIM II I I 11 I I I I II I I I I I I I II I I II I II I I I I I I I I I I 
Db 115 GAG GAG GAAGAG GAGGAC GAGGAC GAAGAC C T GGAGGAGCT GGAG GT GCT G GAGAGGAAG 174 

Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486

Mill I I I I I I || | || I II M M II I II I I I I 

Db 175 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 234 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546

M I I M I I I I I I I I I I I I || M II I I II I I I I I I I I I M II I 

Db 235 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 294 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

M I II I II I I I I I I I I I I I I I I I I I I I II I I I I I I II I 

Db 295 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 354

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657 

M M M M M II I I I I I II I I I I I I I I I I I I II I I I I I I II I I I I I I I I I I M 

Db 355 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 414 
Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

M M I M I MM II I I I I I M II III III I I I I I I I I Ml III 

Db 415 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 474 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750

M II I I I I I II I M I II II I II II I II I I I I 

Db 475 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 534 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

M II II I I I I II I II I I I I I I I I I || | || | || | | || || || || | M I I I I I I II I M I 
Db 535 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 594 

Qy 808 GTGATACCCTCCTCTGCAGAAAAAATTATGGATTTGATGGAGCAGCCAGGTAACACTGTT 867

M M II I II I II I I || I || M I I I II II MM II I II II II I II I I I || II II 
Db 595 GTGATACGCTCCTCTGCAGAAAA TAT GGACTT GAAG GAGC AGC CAGGTAAC ACT ATT 651 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927

Ml II II II I I I II II I I II II II II I I I I I I I I I I I I I | | I I I | I | I I I I I I 

Db 652 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 711 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987 

M M I I II I II II I I I II II II II II II I || I II I I I I I I I I | | || Ml II 



Db 712 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 771 

Qy 988 GTGTCATCCTCAGAAGGAACAATTGAAGAAACTTTAAATGAAGCTTCTAAAGAGTTGCCA 1047

II I I M I I I I I I I I I I II I I I I I I I I | I I I I I I I I II I I I I I I | || 
Db 772 GT AT T AC C C AC T GAAGGAACACT T CAAGAAAAT GT C AGT GAAG CT T CT AAAGAGGT C T C A 831 

Qy 1048 GAGAGGGCAACAAATCCATTTGTAAATAGAGATTTAGCAGAATTTTCAGAATTAGAATAT 1107

MM I I I I I M M I I M I M I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 832 GAGAAGGCAAAAACTCTACTCATAGATAGAGATTTAACAGAGTTTTCAGAATTAGAATAC 891

Qy 1108 T C AGAAAT GG GAT CAT CT T T TAAAGGCT C C C CAAAAGGAGAGT CAG C CAT AT T AGT AGAA 1167 

I I I I I I I I I I I I I I I I I II I I III I I I I I II Ml || | | | || | M I II I 
Db 892 T C AGAAAT GGGAT CAT C GT T C AGT GTCT C T C CAAAAGC AGAAT CT G C C GT AAT AGT AGCA 951 

Qy 1168 AAC AC T AAG GAAGAAGT AAT T GT GAGGAGT AAA GACAAAGAGGAT T T AGT T T GT AGT 1224 

M M I I I I I I I I I I I I I I I I I fill II I I I I I I I I I I II I I I I 
Db 952 AAT C CT AG G GAAGAAAT AAT C GT GAAAAATAAAGAT GAAGAAGAGAAGT T AGT T AGT AAT 1011 

Qy 1225 GC AGC C CT T C AC AGT C CAC AAGAAT CACCT GTGGGTAAAGAAGAC 1269 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 

Db 1012 AAC AT C CT T CAT AAT CAACAAGAGT T AC CT ACAGCT C TT AC TAAAT T GGT TAAAGAGGAT 1071 

Qy 1270 AGAGTTGTGTCTCCAGAAAAGACAATGGACATTTTTAATGAAATGCAGATGTCAGTAGTA 1329

I I I I M I I I I I I I I I I I IN I I I I || | | | || | | | | | | | | | | | | 

Db 1072 GAAGTT GT GT CTT CAGAAAAAGCAAAAGACAGTTTTAAT GAAAAGAGAGTT GCAGT GGAA 1131 

Qy 1330 GCACCTGTGAGGGAAGAGTATGCAGACTTTAAGCCATTTGAACAAGCATGGGAAGTGAAA 1389

M I I I M I I I I I II I I I M I I I I I I II I I I I I I I I I II || I I | | | | | | | | | 
Db 1132 GC T C CT AT GAGG GAG GAAT AT G CAGACT T CAAACC AT TT GAGC GAGT AT G GGAAGT GAAA 1191 

Qy 1390 GATACTTATGAGGGAAGTAGGGATGTGCTGGCTGCTAGAGCT AATGTG 1437 

I I I I Mill I I I I II II I I I II I I I I I I I M II 

Db 1192 GATA GTAAGGAAGATAGT GAT AT GT T GG C T GCT GGAGGTAAAAT C GAGAG CAACT T G 1248 

Qy 1438 GAAAGTAAAGT GG AC AGAAAAT GCTT GGAAGATAGCCTGGAGCAAAAAAGT CTT GGGAAG 1497 

I M I II I I I I I I I I I I II I I I I II I I I II I I I III I II 

Db 1249 GAAAGTAAAGTGGATAAAAAATGTTTTGCAGATAGCCTTGAGCAAACTAATCACGAAAAA 1308

Qy 1498 GATAGTGAAGGCAGAAATGAGGATGCTTCTTTCCCCAGTACCCCAGAACCTGTGAAGGAC 1557

I I I I I I I I I M I I I I I III I I I I I I I I I M I I I I I I I I I I I I I | M I I 
Db 1309 GATAGTGAGAGTAGTAATGATGATACTTCTTTCCCCAGTACGCCAGAAGGTATAAAGGAT 1368

Qy 1558 AGCTCCAGAGCATATATTACCTGTGCTTCCTTTACCT CAGCAACCGAAAGCACCACA 1614 

I I I I I M I I I II I I I I I I I I I I I | I I | I I I I I I I I I I I I I II 

Db 1369 C GT C CAG GAG CAT AT AT CAC AT GTGCTCCCTT TAAC C C AGC AG CAACT GAGAGCAT T GC A 1428 

Qy 1615 G C AAAC AC TTTCCCTTTGT T AGAAG AT CAT AC T T C AGAAAAT AAAAC AG AT GAAAAAAAA 1674 

M I I I I Ml I I I I I I M I I II I I I I I I I I I II I I I I | | | || I I I I I I I I I I I I 
Db 1429 ACAAACATTTTTCCTTTGTTAGGAGATCCTACTTCAGAAAATAAGACCGATGAAAAAAAA 1488

Qy 1675 AT AGAAGAAAG GAAG G C C C AAAT TAT AAC AG AGAAG AC TAG C C C C AAAAC GT C AAAT 1731 

M I I I I I I I I I I I I I I I II I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1489 ATAGAAGAAAAGAAGGCCCAAATAGTAACAGAGAAGAATACTAGCACCAAAACATCAAAC 1548

Qy 1732 CCTTTCCTT GTAG C AGT ACAGGAT T CT GAG GCAGAT T AT GT T ACAAC AGAT AC CT T AT C A 1791 

Mill II I II I I I I I I I I I I I I I I I I I | I I I | | | | | || | | | | M I II I II I II 
Db 1549 CCTTTTCTT GT AGCAGC ACAG GAT T C T GAGACAGAT TAT GT CAC AAC AGAT AAT T T AACA 1608 



Qy 1792 AAGGTGACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATTTAGTTCAG 1851
M 1 1 1 1 1 1 1 1 1 1 1 II IN | | || | | | | | | | | | | | | | | | | | | I | | | | | | M 1 Ml
Db 1609 AAGGTGACTGAGGAAGTCGTGGCAAACATGCCTGAAGGCCTGACTCCAGATTTAGTACAG 1668

Qy 1852 GAAGCATGTGAAAGTGAACTGAATGAAGCCACAGGTACAAAGATTGCTTATGAAACAAAA 1911
1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I M 1 1 1 1 1 II 1 1 1
Db 1669 GAAGCATGTGAAAGTGAATTGAATGAAGTTACTGGTACAAAGATTGCTTATGAAACAAAA 1728

Qy 1912 GTGGACTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAGCACAGCTT 1971
1 1 1 1 M 1 M 1 1 1 1 1 1 1 1 1 1 1 II 1 II 1 II 1 1 1 1 1 1 1 1 II M 1 1 1 1 1 1 II 1 1 1
Db 1729 ATGGACTTGGTTCAAACATCAGAAGTTATGCAAGAGTCACTCTATCCTGCAGCACAGCTT 1788

Qy 1972 TGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATATTGTTATG 2031
1 M 1 1 II 1 1 1 1 1 II II 1 Mill 1 1 1 II II 1 1 II | || | | || 1 1 M 1 M 1 1 1 1 II
Db 1789 TGCCCATCATTTGAAGAGTCAGAAGCTACTCCTTCACCAGTTTTGCCTGACATTGTTATG 1848

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091
1 1 1 1 1 1 1 1 1 1 1 M M II M II 1 1 II 1 II 1 1 1 1 1 1 I 1 1 II 1 1 M II 1
Db 1849 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 1908

Qy 2092 TCCCCACTGGAAGCACCTCCTCCAGTTAGTTATGACAGTATAAAGCTTGAGCCTGAAAAC 2151
II II 1 M M 1 M M 1 II II M M 1 II II 1 M 1 1 II II 1 1 M 1
Db 1909 T C AC CAT T AGAAG CT T CTT C AGT TAATT AT GAAAGC ATAAAAC AT GAG C C T GAAAAC 1965

Qy 2152 C C C C C AC CAT AT GAAGAAG C CAT GAAT GT AG C AC T AAAAGCTTTGGGAACAAAGGAA 2208
1 1 1 1 M 1 1 1 M 1 M M 1 1 II 1 II 1 II M II 1 1 Mill 1 II 1 1 M 1 1 1 1 1
Db 1966 CCCCCACCATATGAAGAGGCCATGAGTGTATCACTAAAAAAAGTATCAGGAATAAAGGAA 2025

Qy 2209 GGAATAAAAGAGCCTGAAAGTTTTAATGCAGCTGTTCAGGAAACAGAAGCTCCTTATATA 2268
1 Ml M M 1 II II 1 1 M 1 1 1 II 1 1 II 1 1 1 MM 1 1 II M 1 1 M 1 1 M II 1 II 1 1
Db 2026 GAAATTAAAGAGCCTGAAAATATTAATGCAGCTCTTCAAGAAACAGAAGCTCCTTATATA 2085

Qy 2269 TCCATTGCGTGTGATTTAATTAAAGAAACAAAGCTCTCCACTGAGCCAAGTCCAGATTTC 2328
II 1 1 1 1 1 1 M 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 | | || II 1 II II 1 1 III Ml II II 1 1
Db 2086 TCTATTGCATGTGATTTAATTAAAGAAACAAAGCTTTCTGCTGAACCAGCTCCGGATTTC 2145

Qy 2329 TCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGCTAGTG 2388
1 1 1 M II 1 II 1 II M 1 II II II 1 M 1 1 1 1 1 II 1 II II II 1 II 1 II II
Db 2146 TCTGATTATTCAGAAATGGCAAAAGTTGAACAGCCAGTGCCTGATCATTCTGAGCTAGTT 2205

Qy 2389 GAGGATTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTCCTGAA 2448
II 1 M 1 1 II II 1 II 1 1 II 1 II 1 1 1 1 1 1 II 1 M 1 II 1 1 1 II 1 1 II II 1 1 II Mill
Db 2206 GAAGATTCCTCACCTGATTCTGAACCAGTTGACTTATTTAGTGATGATTCAATACCTGAC 2265

Qy 2449 GT C C C ACAAACACAAGAGGAG GCT GT GAT GCT CAT GAAGGAGAGT C T C ACT GA A 2502
1 1 1 1 M M 1 1 II 1 II II 1 1 1 II 1 II II 1 II 1 II II II 1 II II II 1
Db 2266 GTTCCACAAAAACAAGATGAAACTGTGATGCTTGTGAAAGAAAGTCTCACTGAGACTTCA 2325

Qy 2503 GTGTCTGAGACAGTAGCCCAGCACAAAGAGGAGAGACTTAGTGCCTCACCTCAGGAGCTA 2562
1 1 Ml 1 III II M 1 II 1 1 II 1 1 1 III Ml 1
Db 2326 TTTGAGTCAATGATAGAATATGAAAATAAGGAAAAACTCAGTGCTTTGCCACCTGAGGGA 2385

Qy 2563 GGAAAG C CAT AT T T AGAGT CT T T T CAG C C CAAT T T AC AT AGT ACAAAAGAT G C TGCA 2619
1 1 M II 1 1 II 1 M 1 II 1 1 1 1 II M 1 1 1 1 1 1 1 1 II II II 1 1 II 1 1 1
Db 2386 GGAAAGCCATATTTGGAATCTTTTAAGCTCAGTTTAGATAACACAAAAGATACCCTGTTA 2445



Qy 2620 TCTAATGACATTCCAACATTGACCAAAAAGGAGAAAATTTCTTTGCAAATGGAAGAGTTT 2679

II I I I I M I I I I I I I I I I M I I I | | | | | | | | | | | | || | | | | | | | | Ml | 
Db 2446 CCTGATGAAGTTTCAACATTGAGCAAAAAGGAGAAAATTCCTTTGCAGATGGAGGAGCTC 2505

Qy 2680 AAT AC T G CAAT T TAT T CAAAT GAT GACT T ACT T T CT T CT AAG GAAGACAAAAT AAAAGAA 2739 

I M I I I I I I I I I I I M I I I I I I I I I I I I I I | | | | | || | | | | I I I I I I I I I 
Db 2506 AGT AC T G C AGT T TAT T CAAAT GAT GACT TAT T TAT T T C T AAGGAAG C AC AGAT AAGAGAA 2565 

Qy 2740 AGTGAAACATTTTCAGATTCATCTCCGATTGAGATAATAGATGAATTTCCCACGTTTGTC 2799

I I I I I I I I | I | | | | | | | || | | || | | | | | | || M I I I I I I || || || M II 
Db 2566 ACTGAAACGTTTTCAGATTCATCTCCAATTGAAATTATAGATGAGTTCCCTACATTGATC 2625

Qy 2800 AGT GCT AAAGAT GAT TC TCC T AAAT TAG C CAAG GAGT ACAC T GAT C T AGAAGT AT C C 2856

HI I I I I I I II I I I I I I I I I I IN || || I I I I I I I I I 

Db 2626 AGTTCTAAAACTGATTCATTTTCTAAATTAGCCAGGGAATATACTGACCTAGAAGTATCC 2685

Qy 2857 GACAAAAGTGAAATTGCTAATATCCAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTG 2916

I I I I I I I I I I I I II I II MM! I I I I | I MUM 

Db 2686 CACAAAAGT GAAAT TGCTAATGCCCCGGATGGAGCTGGGTCATTGCCTTGCACAGAATTG 2745 

Qy 2917 C C CT GT GAC CTTTCTTT CAAGAAT AT AT AT C CT AAAGAT GAAG T AC AT GT T T C A 2970

M I I M M I M II I I I I I II I || I M II M I II I I I Mill 

Db 2746 CCCCATGACCTTTCTTTGAAGAACATACAACCCAAAGTTGAAGAGAAAATCAGTTTCTCA 2805

Qy 2971 GAT GAATT C TC CGAAAAT AGGT C C AGT GT AT CT AAGGC AT C CAT AT C G C CT T CAAAT GT C 3030 

I I M I M M M I M M M I M I II I I I I I I I I I I M | || | 

Db 2806 GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 2865

Qy 3031 T CT G C T T T GGAAC C T CAGAC AGAAAT GGGC AGC ATAGT TAAAT C CAAAT C AC T TAC GAAA 3090 

I I I I I I I I I I I M I M II II I I II I I I I II II I I I I II I I I MM 
Db 2866 TCTGCTTTGGCCACTCAAGCAGAGATAGAGAGCATAGTTAAACCCAAAGTTCTTGTGAAA 2925

Qy 3091 GAAGCAGAGAAAAAACTTCCTTCTGACACAGAGAAAGAGGACAGATCCCTGTCAGCTGTA 3150

I I I I I I M I II I I M I I I I I II II I I || I | | | | | || | || II II I M Ml II 

Db 2926 GAAGCTGAGAAAAAACTTCCTTCCGATACAGAAAAAGAGGACAGATCACCATCTGCTATA 2985

Qy 3151 T T GT CAG C AGAGCT GAGT AAAACT T C AGTT GT T GAC CT C CT C TACT G G AG AG ACATT AAG 3210 

II I II M II I M I M I I I I M I M M M I I I II I II I II I I M I I I II M I I II I I II 

Db 2986 TTTTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTGTACTGGAGAGACATTAAG 3045

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270 

I I I I I I I I I I I I M I II I I I I II I || II II II I I II I I II I II I I I I II 

Db 3046 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3105

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330 

I I I I I II I I I I I M I M I I M I M II II II I I I I I I I I || || I II I M I II M M 
Db 3106 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3165 

Qy 3331 AT AT AT AAG G GC GT GAT C CAG GCT AT C C AGAAAT CAGAT GAAG G C C ACC C AT T CAGGG C A 3390 

I I I I I I I I I I M I M II I II II I II I I I I I I I II II || I I M II I II I II II M I II 
Db 3166 AT ATACAAGGGT GT GAT C CAAGCT AT C C AGAAAT CAGAT GAAGG C CAC C C ATT CAGGG C A 3225 

Qy 3391 TATTTAGAATCTGAAGTTGCTATATCAGAGGAATTGGTTCAGAAATACAGTAATTCTGCT 3450

M I I M M I M M II I I I I II I I I I I I I I II II II I II II I M I I I I I II 

Db 3226 T AT CT G GAAT CT GAAGT T G CT AT AT CT GAG GAGT TGGTTC AGAAGT AC AGT AAT TCT GCT 3285 

Qy 3451 C T T G GT CAT GT GAAC AG C ACAATAAAAGAACT GAGG C GGCT T T T CTT AGT T GAT GAT T T A 3510 



Db 3286 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3345

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570 

I I I I I I I I I I I I I I I I II I I I I I | | I I | | I I I I | I | | | | U | | | | | | | | | | | | | | | 
Db 3346 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3405 

Qy 3571 AAT G GT C T GAC AC TACT GAT T T T AG CT CT GAT CT CACT CTT CAGT AT T C C T GTT AT TT AT 3630 

I I I I M I I I I I I I I I I I I I I I I I Mill II I I I I II I I I I I I I | | | | | || | | | | || 
Db 3406 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3465

Qy 3631 GAAC G GCAT CAGGT G CAGAT AGAT CAT TAT C TAGGACT T GCAAACAAGAGT GT T AAGGAT 3690 

I I I I I I I I I II I I I II I I I I I II I I I I I I I I M I I I I II I I II I I I I I | | | | | | M 

Db 3466 GAAC GG CAT CAGGC GC AGATAGAT CAT TAT CT AGGAC T T G CAAATAAGAAT GT TAAAGAT 3525 

Qy 3691 GCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740

II I I I I I I I I I M I I I I I I I I I I I I I I II I II II I I I I I I I | | | | || 

Db 3526 GCTATGGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3575



RESULT 11 
US-10-466-258-8 

; Sequence 8, Application US/10466258
; Publication No. US20040132096A1
; GENERAL INFORMATION:
; APPLICANT: GLAXO GROUP LIMITED
; TITLE OF INVENTION: ASSAY
; FILE REFERENCE: P8 0966 GCW
; CURRENT APPLICATION NUMBER: US/10/466,258
; CURRENT FILING DATE: 2003-07-15
; NUMBER OF SEQ ID NOS: 13
; SOFTWARE: PatentIn version 3.0
; SEQ ID NO 8
; LENGTH: 3579
; TYPE: DNA
; ORGANISM: Homo sapiens
; FEATURE:
; NAME/KEY: CDS
; LOCATION: (1)..(3579)
US-10-466-258-8 



Query Match 61.2%; Score 2289.2; DB 17; Length 3579; 

Best Local Similarity 81.5%; Pred. No. 0; 

Matches 2925; Conservative 0; Mismatches 548; Indels 117; Gaps 19; 

Qy 253 ATGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCT 312

I I I I I I I I I I I I I M M I I I M I I I I I I I I I || | | || | | || | | | | | | || 
Db 1 ATGGAAGACCTGGACCAGTCTCCTCTGGT CTCGTCCTCGGACAGCCCACCCCGGCCG 57 

Qy 313 CCGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAGGACGAGGAGGAG 372

I I M I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I | | | | | | || || | | | 
Db 58 C AG C CC G C GT T CAAGT AC CAGT T C GT GAGGGAGC C C GAGGAC GAG GAG GAAGAAGAG 114 



Qy 373 GAGGAGGAC GAG GAG GAG GAC GAC GAGGAC C T AGAG GAAC T G GAG GT G CT G GAGAGGAAG 432 

M M I I I I I I I I I I I II I I I II I I I I I I I I I I I I I 

Db 115 GAGGAG GAAGAG GAGGAC GAGGAC GAAGAC C T G GAGGAG C T GGAGGT G CT GGAGAGGAAG 174 



Qy 433 CCCGCAGCCGGGCTGTCCGCAGCTGCGGTGC CGCCCGCCGCCGCCGCGCCGCTG 486

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I 

Db 175 CCCGCCGCCGGGCTGTCCGCGGCCCCAGTGCCCACCGCCCCTGCCGCCGGCGCGCCCCTG 234 

Qy 487 CTGGACTTCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCC 546

I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I II I I I M I I I I I 
Db 235 ATGGACTTCGGAAATGACTTCGTGCCGCCGGCGCCCCGGGGACCCCTGCCGGCCGCTCCC 294 

Qy 547 CCTGCCGCTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCG CGGCGCCC 597 

II I II I II II I I I I M M M I I I I I I I I I I I I I I MM 

Db 295 CCCGTCGCCCCGGAGCGGCAGCCGTCTTGGGACCCGAGCCCGGTGTCGTCGACCGTGCCC 354 

Qy 598 GCGCCATCCCTGCCGCCCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAG 657

I M I I I I I I I II I I I I I I I I II I M I I I I I II I I I II M I I I I I I II I II II I 
Db 355 GCGCCATCCCCGCTGTCTGCTGCCGCAGTCTCGCCCTCCAAGCTCCCTGAGGACGACGAG 414 

Qy 658 CCTCCGGCGAGGCCCCCGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAG 711 

I I M II I I I I I I II I I I I I II II I II III I I I I I I I I III III 
Db 415 CCTCCGGCCCGGCCTCCCCCTCCTCCCCCGGCCAGCGTGAGCCCCCAGGCAGAGCCCGTG 474 

Qy 712 CCCGCCGCGCCCCCTTCCACGCCGGCCGCGCCCAAGCGC 750

I I I I I I II I I I I I I I I I I I II I I I I I II I I I I I I I I I 
Db 475 TGGACCCCGCCAGCCCCGGCTCCCGCCGCGCCCCCCTCCACCCCGGCCGCGCCCAAGCGC 534 

Qy 751 AGGGGCTCC GGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 807 

I M I I I I M I II I I II I M I M I I I I I M II I I M M 1 I I M II II II I 1 M I M I I 

Db 535 AGGGGCTCCTCGGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTGCTGCATCTGAGCCT 594 

Qy 808 GTGATACCCTCCTCTGCAGAAAAAATTATGGATTTGATGGAGCAGCCAGGTAACACTGTT 867

I I I II I I II M I I I I I II I I II I I II II I I I I I I I I II II I I II I I II I I I II 

Db 595 GTGATACGCTCCTCTGCAGAAAA TAT GGACTT GAAGGAGCAGCCAGGT AACACT ATT 651 

Qy 868 TCGTCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCCTCTCTTCCTTCT 927

Ml I I I I I II II I I I M M I I I I M II I II I I I I I II I I I I I I II I I I I I I II I M I I 
Db 652 TCGGCTGGTCAAGAGGATTTCCCATCTGTCCTGCTTGAAACTGCTGCTTCTCTTCCTTCT 711 

Qy 928 CTATCTCCTCTCTCAACTGTTTCTTTTAAAGAACATGGATACCTTGGTAACTTATCAGCA 987

M M M M M M M I I I II M I M M M I II I I M M M I M M I I III M 

Db 712 CTGTCTCCTCTCTCAGCCGCTTCTTTCAAAGAACATGAATACCTTGGTAATTTGTCAACA 771 

Qy 988 GTGTCATCCTCAGAAGGAACAATTGAAGAAACTTTAAATGAAGCTTCTAAAGAGTTGCCA 1047

II I I II I II I M I II I M I I I I I I I I I I I I II I II I II M I I I I II 

Db 772 GTAT TAC C CACT GAAG GAAC ACT T CAAGAAAAT GT CAGT GAAG CT T C TAAAGAG GT C T C A 831 

Qy 1048 GAGAGGGCAACAAAT CCATT T GT AAATAGAGATTTAGCAGAATTTTCAGAATTAGAAT AT 1107 

I I I I I I I I I II II I I M I II I I I I I I II MM I M I II I II I M I I I I I 

Db 832 GAGAAGG CAAAAACT CT ACT C AT AGAT AGAGATT T AAC AGAGTTT T C AGAAT TAGAAT AC 891 

Qy 1108 T CAGAAAT GGGAT CAT CT T T TAAAGGC T C C C CAAAAG GAGAGT CAGC CAT AT TAGT AGAA 1167 

II I I I M II M I M I M II I ! Ml M M I II Ml II Ml M I M M I I 

Db 892 T CAGAAAT GGGAT CAT CGT T C AGT GT C T CT CCAAAAGCAGAAT CT GC C GTAAT AGTAG CA 951 

Qy 1168 AACACT AAGGAAGAAGTAAT T GT GAGGAGT AAA GACAAAGAG GAT T TAGT TTGT AGT 1224 

II M I II I I II I I I I I I I II Mill II I II I I I II I II I I I I I 
Db 952 AAT C CTAGG GAAGAAAT AAT CGT GAAAAAT AAAGAT GAAGAAGAGAAGT T AGTT AGT AAT 1011 



Qy 1225 GCAGCCCTTCACAGTCCACAAGAATCACCT GTGGGTAAAGAAGAC 1269
Db 1012 AACATCCTTCATAATCAACAAGAGTTACCTACAGCTCTTACTAAATTGGTTAAAGAGGAT 1071

Qy 1270 AGAGTTGTGTCTCCAGAAAAGACAATGGACATTTTTAATGAAATGCAGATGTCAGTAGTA 1329
N I I I I I M I I I I I I I | Ml I I I I I I I I I I I I I I I I I I I I I | |
Db 1072 GAAGTTGTGTCTTCAGAAAAAGCAAAAGACAGTTTTAATGAAAAGAGAGTTGCAGTGGAA 1131

Qy 1330 GCACCTGTGAGGGAAGAGTATGCAGACTTTAAGCCATTTGAACAAGCATGGGAAGTGAAA 1389
'I HI I I M I M II I I II I I I II I I II MINIM I || || I | || M I II I I
Db 1132 GCTCCTATGAGGGAGGAATATGCAGACTTCAAACCATTTGAGCGAGTATGGGAAGTGAAA 1191

Qy 1390 GAT ACTT AT GAGGGAAGT AGGGAT GT GCT GGCT GCT AGAGCT AATGTG 1437
I I I I I I I I I II I M I I I M I II I I I II I I || ||
Db 1192 GATA GT AAGGAAGAT AGT GAT AT GTTGGCTGCT GGAG GT AAAAT C GAGAGCAACT T G 1248

Qy 1438 GAAAGTAAAGTGGACAGAAAATGCTTGGAAGATAGCCTGGAGCAAAAAAGTCTTGGGAAG 1497
N I M M M I I I II I II I I I I || | | || || | | M II I I II I III I ||
Db 1249 GAAAGTAAAGTGGATAAAAAATGTTTTGCAGATAGCCTTGAGCAAACTAATCACGAAAAA 1308

Qy 1498 GATAGTGAAGGCAGAAATGAGGATGCTTCTTTCCCCAGTACCCCAGAACCTGTGAAGGAC 1557
H I I I I I M M M I M II II II II II II M I II I M I I I I I I I
Db 1309 GATAGTGAGAGTAGTAATGATGATACTTCTTTCCCCAGTACGCCAGAAGGTATAAAGGAT 1368

Qy 1558 AG CT C CAGAGCAT ATAT T AC C T GT GCTTCCTT T ACCT C AGCAAC C GAAAGC AC C AC A 1614
I I I I II II I M I M M I I M I M II I I I II II II I I I I II II
Db 1369 CGTCCAGGAGCATATATCACATGTGCTCCCTTTAACCCAGCAGCAACTGAGAGCATTGCA 1428

Qy 1615 GCAAACACTTTCCCTTTGTTAGAAGATCATACTTCAGAAAATAAAACAGATGAAAAAAAA 1674
HUM III II II II I I || I I || | || || | | | || | || M | M | | || || || || | |
Db 1429 ACAAACATTTTTCCTTTGTTAGGAGATCCTACTTCAGAAAATAAGACCGATGAAAAAAAA 1488

Qy 1675 AT AGAAGAAAG GAAGGC C C AAAT T AT AAC AGAGAAG AC TAG C C C C AAAAC GT C AAAT 1731
I I I I I I I I II I II I II I II I I II I I II I I I M I I I I I II M I I I I I
Db 1489 ATAGAAGAAAAGAAGGCCCAAATAGTAACAGAGAAGAATACTAGCACCAAAACATCAAAC 1548

Qy 1732 CCTTTCCTTGTAGCAGTACAGGATTCTGAGGCAGATTATGTTACAACAGATACCTTATCA 1791
I I I I I M M I M I M I II I II I M I II I I I II II I M I II II I I II I I Ml | |
Db 1549 CCTTTTCTTGTAGCAGCACAGGATTCTGAGACAGATTATGTCACAACAGATAATTTAACA 1608

Qy 1792 AAGGTGACTGAGGCAGCAGTGTCAAACATGCCTGAAGGTCTGACGCCAGATTTAGTTCAG 1851
I I I I I I N II II I II Ml MM II II I M Mill II I II II I II I III
Db 1609 AAGGTGACTGAGGAAGTCGTGGCAAACATGCCTGAAGGCCTGACTCCAGATTTAGTACAG 1668

Qy 1852 GAAGCATGTGAAAGTGAACTGAATGAAGCCACAGGTACAAAGATTGCTTATGAAACAAAA 1911
N M M II II II M II II I II II M II II II I II II I || || || II I II I M II I I I
Db 1669 GAAGCATGTGAAAGTGAATTGAATGAAGTTACTGGTACAAAGATTGCTTATGAAACAAAA 1728

Qy 1912 GTGGACTTGGTCCAAACATCAGAAGCTATACAAGAATCACTTTACCCCACAGCACAGCTT 1971
I I I I N II I I II II I I || I || | | || I IMM II II I II I I M M I I
Db 1729 ATGGACTTGGTTCAAACATCAGAAGTTATGCAAGAGTCACTCTATCCTGCAGCACAGCTT 1788

Qy 1972 TGCCCATCATTTGAGGAAGCTGAAGCAACTCCGTCACCAGTTTTGCCTGATATTGTTATG 2031
M M M I I I II I I II II I II I I I I II I II I I II I II I II I I II
Db 1789 TGCCCATCATTTGAAGAGTCAGAAGCTACTCCTTCACCAGTTTTGCCTGACATTGTTATG 1848

Qy 2032 GAAGCACCATTAAATTCTCTCCTTCCAAGCGCTGGTGCTTCTGTAGTGCAGCCCAGTGTA 2091
M I I M I M I I II II M III II II I I II I II IMM 



Db 1849 GAAGCACCATTGAATTCTGCAGTTCCTAGTGCTGGTGCTTCCGTGATACAGCCCAGCTCA 1908 

Qy 2092 T C C C C ACT G GAAGC AC CT C C T C C AGTT AGT TAT GAC AGT ATAAAGCT T GAGC CT GAAAAC 2151 

II HI I MM M II MINI I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1909 T C AC CAT T AGAAG C T T CT T CAGTTAAT TAT GAAAGC ATAAAAC AT GAGC CT GAAAAC 1965

Qy 2152 C C CC C AC CAT AT GAAGAAGC C AT GAAT GTAG CACT AAAAGCT T T G GGAAC AAAGGAA 2208 

I I M I M I I I I I I II I I II I I I | | | | | | | | | | | | | | | | | | || | | | | | || 
Db 1966 CCCCCACCATATGAAGAGGCCATGAGTGTATCACTAAAAAAAGTATCAGGAATAAAGGAA 2025

Qy 2209 GGAAT AAAAGAG C C T GAAAGT T T TAAT GCAGCT GT T CAGGAAAC AGAAG C T C C T TAT AT A 2268 

I Ml I II I I I I I I I I I I | | | || | | | | M | M I I I I I II I I I I I I I I I I II I I II 
Db 2026 GAAATTAAAGAG C CT GAAAAT AT TAAT G CAGCT CT T CAAGAAAC AGAAGCT C C T TAT ATA 2085 

Qy 2269 T C CAT T GC GT GT GAT T TAAT T AAAGAAACAAAG CT CT C CACT GAG C C AAGT C CAGAT TT C 2328 

M MMI I I I I I I I I I I II I I II I I I II I I I I I II I I I I Ml III I I I | | | 
Db 2086 T CT AT T GC AT GT GAT T TAAT T AAAGAAACAAAGCT TT CT GC T GAAC C AG CT C C GGAT TT C 2145 

Qy 2329 TCTAATTATTCAGAAATAGCAAAATTCGAGAAGTCGGTGCCCGAACACGCTGAGCTAGTG 2388

I M I II II I M I I II I I I I I I I I II I I I I II I I I I I I 

Db 2146 T CT GAT T ATT CAGAAAT GGCAAAAGT T GAAC AG C CAGT GC C T GAT CAT T C T GAG C T AGT T 2205 

Qy 2389 GAGGATTCCTCACCTGAATCTGAACCAGTTGACTTATTTAGTGATGATTCGATTCCTGAA 2448

M M I I I I I I I I I I II I I I | | | M I I I I I I I I I I I II I I I I | | | | M | || Mill 
Db 2206 GAAGATTCCTCACCTGATTCTGAACCAGTTGACTTATTTAGTGATGATTCAATACCTGAC 2265

Qy 2449 GT C C C ACAAAC ACAAGAG GAGGC T GT GAT GCT CAT GAAGGAGAGT CT CACT GA A 2502

M I I I I I I I I I I I I I I I I I I M I I I I I I I I I I I II I I M I II I I I 
Db 2266 GTTCCACAAAAACAAGATGAAACTGTGATGCTTGTGAAAGAAAGTCTCACTGAGACTTCA 2325

Qy 2503 GT GT C T GAGACAGT AG C C CAGC ACAAAGAGGAGAGAC T T AGT G C C T C AC CT C AG GAG CT A 2562 

I I Ml I III I I II I I II II I II I III || | I 

Db 2326 T T T GAGT CAAT G AT AGAAT AT GAAAAT AAGGAAAAAC T CAGT G CT TT GC C AC C T GAGGGA 2385 

Qy 2563 GGAAAG C CAT AT T T AGAGT CT T T T C AGC C CAAT T T AC AT AGT AC AAAAGAT GC TGCA 2619

M M II II II I M || II I I I I II I I M I I I I 

Db 2386 GGAAAG C CAT AT T T GGAAT C T T T T AAGCT CAGT T T AGAT AAC ACAAAAGAT AC C C T GT T A 2445 

Qy 2620 TCTAATGACATTCCAACATTGACCAAAAAGGAGAAAATTTCTTTGCAAATGGAAGAGTTT 2679

M I I I I II I I II II I I I I II I I I II I I I II I I I I I I || || || M I III I 
Db 2446 CCTGATGAAGTTTCAACATTGAGCAAAAAGGAGAAAATTCCTTTGCAGATGGAGGAGCTC 2505

Qy 2680 AAT AC T G CAAT T TAT T CAAAT GAT GACT T AC T T T CT T C TAAG GAAGACAAAAT AAAAGAA 2739 

I M M I M I II M I I I I I II II I I I I I I II I I I I I I I || | | | || || Mil 
Db 2506 AGT ACT GC AGT T TAT T CAAAT GAT GACT TAT T TAT T T CTAAG GAAGC AC AGAT AAGAGAA 2565 

Qy 2740 AGTGAAACATTTTCAGATTCATCTCCGATTGAGATAATAGATGAATTTCCCACGTTTGTC 2799

I M I I I I II I I I II I II I II I II I Mill II II I II M I II II I 

Db 2566 AC T GAAAC GT T T T CAGAT T CAT CT C CAAT T GAAAT TAT AGAT GAGT T C C CT AC AT T GAT C 2625 

Qy 2800 AGT G C T AAAGAT GAT T C T C CT AAAT T AGC CAAG GAGT ACACT GAT CT AGAAGT AT C C 2856

Ml II I II I II I II I I I I I I I II II I I II I II I I I I I II M I II I I II I 
Db 2626 AGTTCTAAAACTGATTCATTTTCTAAATTAGCCAGGGAATATACTGACCTAGAAGTATCC 2685

Qy 2857 GACAAAAGTGAAATTGCTAATATCCAAAGCGGGGCAGATTCATTGCCTTGCTTAGAATTG 2916

Ml I MUM II Mill I II II I I I II I II 

Db 2686 CACAAAAGTGAAATTGCTAATGCCCCGGATGGAGCTGGGTCATTGCCTTGCACAGAATTG 2745



Qy 2917 C C C T GT GAC CTTTCTTT CAAGAAT AT AT AT C C TAAAGAT GAAG TACATGTTTCA 2970
Ml I M I I I I II I I I II I I I I I I I I I I I I I II | | | | | | M I
Db 2746 CCCCATGACCTTTCTTTGAAGAACATACAACCCAAAGTTGAAGAGAAAATCAGTTTCTCA 2805

Qy 2971 GATGAATTCTCCGAAAATAGGTCCAGTGTATCTAAGGCATCCATATCGCCTTCAAATGTC 3030
I I I M I I I I I I I I I II II I I I I I I I I I I I I I I I I I I I | I I
Db 2806 GATGACTTTTCTAAAAATGGGTCTGCTACATCAAAGGTGCTCTTATTGCCTCCAGATGTT 2865

Qy 3031 TCTGCTTTGGAACCTCAGACAGAAATGGGCAGCATAGTTAAATCCAAATCACTTACGAAA 3090
I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I | | | | | | | Ml | | | |
Db 2866 TCTGCTTTGGCCACTCAAGCAGAGATAGAGAGCATAGTTAAACCCAAAGTTCTTGTGAAA 2925

Qy 3091 GAAGCAGAGAAAAAACTTCCTTCTGACACAGAGAAAGAGGACAGATCCCTGTCAGCTGTA 3150
Mill I I I I I I I I II II II I I I II I I I II I I I II I I | | | | | | | | || Ml ||
Db 2926 GAAGCTGAGAAAAAACTTCCTTCCGATACAGAAAAAGAGGACAGATCACCATCTGCTATA 2985

Qy 3151 TTGTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTCTACTGGAGAGACATTAAG 3210
M I I I I I I I I I I II I I I I I I I I I I I M I II I I I I I II I I I I I I I I I I I I | | | | | || | |
Db 2986 TTTTCAGCAGAGCTGAGTAAAACTTCAGTTGTTGACCTCCTGTACTGGAGAGACATTAAG 3045

Qy 3211 AAGACTGGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGC 3270
I I I I I I M I I I I I I I I I I I I I I I I I || I I I I I I I I I | | | | M I I I I I I I MINI
Db 3046 AAGACTGGAGTGGTGTTTGGTGCCAGCCTATTCCTGCTGCTTTCATTGACAGTATTCAGC 3105

Qy 3271 ATTGTCAGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGG 3330
I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | M I I I I I II I I I
Db 3106 ATTGTGAGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGG 3165

Qy 3331 ATATATAAGGGCGTGATCCAGGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCA 3390
I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | || | | | | | || || | | | | | | | |
Db 3166 ATATACAAGGGTGTGATCCAAGCTATCCAGAAATCAGATGAAGGCCACCCATTCAGGGCA 3225

Qy 3391 TATTTAGAATCTGAAGTTGCTATATCAGAGGAATTGGTTCAGAAATACAGTAATTCTGCT 3450
Ml I I I I M I I I I I I M I I I I I I I Mill I I I I II I I I I I | | | | | | | | | || | | | |
Db 3226 TATCTGGAATCTGAAGTTGCTATATCTGAGGAGTTGGTTCAGAAGTACAGTAATTCTGCT 3285

Qy 3451 CTTGGTCATGTGAACAGCACAATAAAAGAACTGAGGCGGCTTTTCTTAGTTGATGATTTA 3510
I I I I I I I I I I I I I I I I I I I Mill Mill I I II I II M I I II I I I II I
Db 3286 CTTGGTCATGTGAACTGCACGATAAAGGAACTCAGGCGCCTCTTCTTAGTTGATGATTTA 3345

Qy 3511 GTTGATTCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTC 3570
I I I I I M I I I M I II M I II I II II I I II II II I II II I II I II I I || || || I I II
Db 3346 GTTGATTCTCTGAAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTT 3405

Qy 3571 AATGGTCTGACACTACTGATTTTAGCTCTGATCTCACTCTTCAGTATTCCTGTTATTTAT 3630
I I I I I I I I I M I I II I I I I II II II II I II II I I II II II II I II I I II I I I II II
Db 3406 AATGGTCTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTAT 3465

Qy 3631 GAACGGCATCAGGTGCAGATAGATCATTATCTAGGACTTGCAAACAAGAGTGTTAAGGAT 3690
M M II I II M I I I II II I I I I II II I || || || || || | || | || MM I I II II III
Db 3466 GAACGGCATCAGGCGCAGATAGATCATTATCTAGGACTTGCAAATAAGAATGTTAAAGAT 3525

Qy 3691 GCCATGGCCAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCAGA 3740
II I I I I I I M I I II II II I II II II I || I I II I I II II II II
Db 3526 GCTATGGCTAAAATCCAAGCAAAAATCCCTGGATTGAAGCGCAAAGCTGA 3575



RESULT 12 
US-10-220-891-22 

Sequence 22, Application US/10220891 
Publication No. US20030207286A1 
GENERAL INFORMATION: 
APPLICANT: NAKAGAWARA, AKIRA

TITLE OF INVENTION: NUCLEIC ACID SEQUENCES HAVING CHARACTERISTICS OF ENHANCED
TITLE OF INVENTION: EXPRESSION IN HUMAN NEUROBLASTOMA WITH FAVORABLE PROGNOSIS
TITLE OF INVENTION: BASED ON COMPARISON BETWEEN HUMAN NEUROBLASTOMA WITH FAVORABLE
TITLE OF INVENTION: PROGNOSIS AND HUMAN NEUROBLASTOMA WITH UNFAVORABLE PROGNOSIS

FILE REFERENCE: 7388-73435 

CURRENT APPLICATION NUMBER: US/10/220,891
CURRENT FILING DATE: 2003-03-07 
PRIOR APPLICATION NUMBER: JP 2000/140387 
PRIOR FILING DATE: 2000-05-12 
PRIOR APPLICATION NUMBER: JP 2000/159195 
PRIOR FILING DATE: 2000-03-07 
NUMBER OF SEQ ID NOS: 108
SOFTWARE: PatentIn version 3.2
SEQ ID NO 22
LENGTH: 1980
TYPE: DNA

ORGANISM: Homo sapiens 
US-10-220-891-22 

Query Match 29.1%; Score 1088.8; DB 16; Length 1980; 

Best Local Similarity 83.5%; Pred. No. 3.8e-275; 

Matches 1289; Conservative 0; Mismatches 237; Indels 18; Gaps 4;

Qy 2215 AAAGAGC CT GAAAGT T T TAAT G CAGCT GT T C AGGAAACAGAAGCT C C T T AT AT ATC CATT 2274 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I III 

Db 28 AAAGAGC CT GAAAAT AT TAAT GCAGC T C T T CAAGAAAC AGAAGCT C C T T AT AT AT CTAT T 87 

Qy 2275 GCGT GT GATTT AATTAAAGAAACAAAGCT CT CCACTGAGCCAAGTCCAGATTT CTCTAAT 2334 

II I I I I I I I I II I I I I I I I I I I I I I I I I II MM Ml Ml I I M M M I M 

Db 8 8 GC AT GT GAT T T AATTAAAGAAACAAAG CT TT C T GCT GAAC CAGC T C C GGATTT CT CT GAT 147 

Qy 2335 TAT T C AGAAATAGCAAAAT T C GAGAAGT C GGT GC C CGAAC AC GCT GAGC T AGT G GAGGAT 2394 

M II I M M II I II M I I M M I I II II M M I M I M M M M Ml 

Db 148 TAT T C AGAAAT G GCAAAAGT T GAACAGC CAGT GC CT GAT CAT T CT GAGCT AGT T GAAGAT 207 

Qy 2395 T C C T C AC CT GAAT CT GAAC CAGT T GACT T AT T TAGT GAT GAT T C GAT T C CT GAAGT C CC A 24 54 

I M I II I II II I M II I I I I I I I II M I M II I II I I II I I II II Mill I I Ml 

Db 2 08 T C CT C AC CT GAT T CT GAAC CAGT T GACT T ATT TAGTGAT GAT T CAAT AC C T GAC GT T CC A 2 67 

Qy 2455 C AAAC AC AAGAGGAG GCT GT GAT GCT CAT GAAGG AGAGT CT C AC T G A AGTGTCT 2508 

I I I I I II I I II I II II I I M I I II I I I I II I M I I II I II 
Db 2 68 CAAAAACAAGGT GAAACT GT GAT GCTT GT GAAAGAAAGT CT CACT GAGACT T CATT T GAG 327 

Qy 2 509 GAGAC AGTAGCCCAGCACAAAGAGGAGAGACT TAGTGCCT CACCT CAGGAGCTAGGAAAG 2568 

I Ml I Ml I I I I I I I I I I I I I I III III I I I M II 
Db 328 T CAAT GATAGAAT AT GAAAAT AAG GAAAAAC T CAGT G C T T T G C CAC C T GAGGGAGGAAAG 387 



2569 C CAT AT T T AGAGT C T T T T C AG C C CAAT T T AC AT AGT AC AAAAGAT G C T G CAT C T AAT 2625 

MINIM II II I II I I I I III MINIMI I I M II 

38 8 CCATATTTGGAATCTTTTAAGCTCAGTTTAGATAACACAAAAGATACCCTGTTACCTGAT 447 

262 6 GACAT T C CAACATT GAC CAAAAAGGAGAAAATTT CTTT GCAAAT GGAAGAGTTTAAT ACT 2685 

Nlll I I II I II II II I II M Mill IN I I I I I I 

44 8 GAAGT T T CAAC AT T GAGCAAAAAG GAGAAAAT T C CTT T G CAGAT GGAG GAGCT C AGTAC T 507 

2686 GCAATT T AT T CAAAT GAT GACT TAC T T T CT T C T AAGGAAGACAAAATAAAAGAAAGTGAA 2745 

N I I N I N II I I I II II I I I II II II I II I II I II I I || I MMI I I I I 

508 GCAGTTTATT CAAAT GAT GACTTATTTATTT CTAAGGAAGCACAGATAAGAGAAACTGAA 567 
274 6 ACAT TT T CAGAT T CAT C T C C GAT T GAGATAAT AGAT GAAT T T C C CAC GTTT GT CAGT GCT 2805 

N I I I I I I I N II I II I M I I II I I MM II II II II Mill M 

568 AC GT T T T CAGAT T CAT C T C CAATT GAAAT TAT AGAT GAGT T C C CTACAT T GAT C AGT C CT 627 
2806 AAAGAT GATT C T CCTAAATTAGC CAAGGAGTACACT GATCTAGAAGT AT C CGACAAA 2862 

IN I N I II I II II M II II I I II || || M | | m m I I I I M I 

62 8 AAAACT GATT CATTTT CTAAATTAGC CAGGGAATATACT GACCTAGAAGTAT CC CACAAA 687 

2863 AGT GAAAT T GCT AAT AT C C AAAG C GGGGC AGAT T CAT TGCCTTGCT T AGAAT T G C C CT GT 2 922 
N N I I I I I II I I II II II II I I M M II II I I I M I I II II I I I 

68 8 AGT GAAAT T GCTAAT GC C C C GGAT G GAG C T GGGT CAT T GCCTT GCACAGAAT T GC C C CAT 747 

2923 GAC CTTTCTTT C AAGAAT AT AT AT C C T AAAG AT GAAG TACAT GTTT CAGAT GAA 2976 

N I N N N N II II I III I II MM I II I I I | | || || || | | 

74 8 GACCTTT CTTT GAAGAACATACAACCCAAAGTTGAAGAGAAAATCAGTTT CT CAGAT GAC 807 

2977 TTCTCCGAAAATAGGTCCAGTGTATCTAAGGCATCCATATCGCCTTCAAATGTCTCTGCT 3036 

I'll I I | || I I I II I I II I I I I M I I I I I I I || I 

808 TTTTCTAAAAAT GGGT CTGCTACATCAAAGGT GCT CTTATTGCCTCCAGAT GTTT CTGCT 867 

3037 TTGGAACCTCAGACAGAAATGGGCAGCATAGTTAAATCCAAATCACTTACGAAAGAAGCA 3096 

Nil I II II I II I M I || || | || M M I I I I II Ml I M II II II 

868 T T GG C CACT CAGGC AGAGAT AGAGAGC AT AGT T AAAC C CAAAGT T CT T GT GAAAGAAGCT 927 
3097 GAGAAAAAACTT C C T T CT GAC ACAGAGAAAGAGGAC AGAT C C CT GT CAGCT GT AT T GT C A 3156 

N N N II II II I II II II I II II I M II II M I II III MM III 

92 8 GAGAAAAAACT T C CT T C C GAT ACAGAAAAAGAG GAC AGAT CAC CAT C T GCT AT AT T T T C A 987 
3157 G C AGAGC T GAGTAAAACT T CAGTT GT T GAC CT C CT CT ACT GGAGAGAC AT TAAGAAGACT 3216 

N N II I I II I II I II II I I II II I I II I I || I II I II M I I I II II I I I I II I I I M I 

98 8 G CAGAGC T GAGT AAAACT T CAGT T GT T GAC CT C C T GT ACT GGAGAGAC AT TAAGAAGACT 1047 

3217 GGAGTGGTGTTTGGTGCCAGCTTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTC 3276 

N N I I II II II I I I I II II I I I II I I I I || || M I I I I II I I M I II I I II I 
104 8 GGAGTGGT GTTT GGTGCCAGCCTATTCCAGCTGCTTTCATTGACAGT ATT CAGCATTGTG 1107 

3277 AGTGTAACGGCCTACATTGCCTTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATAT 3336 

N N M I II I I II I II I I II I II M Ill I I M II I II I M I I II I 

1108 AGCGTAACAGCCTACATTGCCTTGGCCCTGCTCTCTGTGACCATCAGCTTTAGGATATAC 1167 

3337 AAGG GC GT GAT C C AG GC T AT C C AGAAAT CAGAT GAAGGC C AC C CAT T CAGG G CAT AT TT A 3396 

INN I II I II I I I II I I II I I I || | || | || || || | | | || || || | || || | 

1168 AAG GGT GT GAT C CAAGC T AT C C AGAAAT CAGAT GAAGGC CAC C CAT T C AGGG CAT AT CT G 1227 

Qy 3397 GAATCTGAAGTTGCTATATCAGAGGAATTGGTTCAGAAATACAGTAATTCTGCTCTTGGT 3456
        |||||||||||||||||||| ||||| ||||||||||| |||||||||||||||||||||
Db 1228 GAATCTGAAGTTGCTATATCTGAGGAGTTGGTTCAGAAGTACAGTAATTCTGCTCTTGGT 1287



Qy 34 57 CAT GT GAAC AGC ACAATAAAAGAACT GAGG CGG C T T T T C T TAGT T GAT GAT TT AGT T GAT 3516 

I I I I I I I I I MM M M I M I M M M I M M M M I M M M M M M M M I 
Db 1288 C AT GT GAAC T GCAC GAT AAAGGAACT CAG GC G CCTCTTCT TAGT T GAT GAT TT AGTT GAT 1347 

Qy 3517 TCCCTGAAGTTTGCAGTGTTGATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGT 357 6 

I I Ml M M M M M M M M M M M M M I M M M M M M I M M I M M I 

Db 1348 TCTCTGGAGTTTGCAGTGTTGATGTGGGTATTTACCTATGTTGGTGCCTTGTTTAATGGT 1407 

Qy 3577 CT GACACTACT GATTTTAGCT CT GAT CTCACT CTT CAGTATTCCTGTTATTTAT GAACGG 363 6 

M M M M M M M M I M M I M M M M M M M M M M M M M M M M M 

Db 1408 CTGACACTACTGATTTTGGCTCTCATTTCACTCTTCAGTGTTCCTGTTATTTATGAACGG 1467 

Qy 3637 CAT CAG G T G CAG AT AG AT CAT TAT C TAG G AC T T G C AAAC AAG A GT GT T AAG GAT G C CAT G 3696 

M M I M M M M I M I M I M M M I M M M M I MM I M M I I I M I I I I 

Db 14 68 CAT CAGGC ACAGAT AGAT CATT AT CTAGGACT T GCAAAT AAGAAT GT T AAAGAT GCT AT G 1527 

Qy 3697 G C CAAAAT C CAAGCAAAAAT C C C T GGATT GAAGC GCAAAGCAGA 3740 

M I M I I M M M M M M M M I M M M M I M M M I M 

Db 1528 GC TAAAAT C CAAGCAAAAAT C C CTGGATT GAAGC G CAAAG CT GA 1571 



RESULT 13 
US-10-205-194-165 

Sequence 165, Application US/10205194 
Publication No. US20030134301A1 
GENERAL INFORMATION: 
APPLICANT: Warner-Lambert Company 
APPLICANT: Lee, Kevin 
APPLICANT: Dixon, Alistair 
APPLICANT: Brooksbank, Robert 
APPLICANT: Pinnock, Robert 

TITLE OF INVENTION: Identification and Use of Molecules Implicated in Pain 
FILE REFERENCE: WL-A-018201 

CURRENT APPLICATION NUMBER: US/10/205,194
CURRENT FILING DATE: 2002-07-24
PRIOR APPLICATION NUMBER: GB 0118354.0 
PRIOR FILING DATE: 2001-07-27 
NUMBER OF SEQ ID NOS: 177
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 165 
LENGTH: 2782 
TYPE: DNA 

ORGANISM: Rattus norvegicus 
FEATURE : 

OTHER INFORMATION: Foocen-m2 reticulon 
US-10-205-194-165 



Query Match 21.6%; Score 809.8; DB 15; Length 2782;
Best Local Similarity 99.8%; Pred. No. 1.5e-201;
Matches 811; Conservative 0; Mismatches 2; Indels 0; Gaps 0;

Qy   14 GCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCGATCGCGAAGGCAG 73
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  462 GCGGCGGCGGCGGCTGCAGCCTGGGACAGGGCGGGTGGCACATCTCGATCGCGAAGGCAG 521

Qy   74 CAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTTCGGCTCGGCTCGG 133
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  522 GAGAAGCAGTCTCATTGTTCCGGGAGCCGTCGCCTCTGCAGGTTCTTCGGCTCGGCTCGG 581

Qy  134 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 193
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  582 CACGACTCGGCCTGCCTGGCCCCTGCCAGTCTTGCCCAACCCCCACAACCGCCCGCGACT 641

Qy  194 CTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCCA 253
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  642 CTGAGGAGAAGCGGCCCTGCGGCGGCTGTAGCTGCAGCATCGTCGGCGACCCGCCAGCCA 701

Qy  254 TGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCTC 313
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  702 TGGAAGACATAGACCAGTCGTCGCTGGTCTCCTCGTCCACGGACAGCCCGCCCCGGCCTC 761

Qy  314 CGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAGGACGAGGAGGAGG 373
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  762 CGCCCGCCTTCAAGTACCAGTTCGTGACGGAGCCCGAGGACGAGGAGGACGAGGAGGAGG 821

Qy  374 AGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTGCTGGAGAGGAAGC 433
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  822 AGGAGGACGAGGAGGAGGACGACGAGGACCTAGAGGAACTGGAGGTGCTGGAGAGGAAGC 881

Qy  434 CCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGCCGCTGCTGGACT 493
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  882 CCGCAGCCGGGCTGTCCGCAGCTGCGGTGCCGCCCGCCGCCGCCGCGCCGCTGCTGGACT 941

Qy  494 TCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGCCG 553
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  942 TCAGCAGCGACTCGGTGCCCCCCGCGCCCCGCGGGCCGCTGCCGGCCGCGCCCCCTGCCG 1001

Qy  554 CTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGCCATCCCTGCCGC 613
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1002 CTCCTGAGAGGCAGCCATCCTGGGAACGCAGCCCCGCGGCGCCCGCGCCATCCCTGCCGC 1061

Qy  614 CCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCCGGCGAGGCCCC 673
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1062 CCGCTGCCGCAGTCCTGCCCTCCAAGCTCCCAGAGGACGACGAGCCTCCGGCGAGGCCCC 1121

Qy  674 CGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGCCCCCTTCCACGC 733
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1122 CGCCTCCGCCGCCAGCCGGCGCGAGCCCCCTGGCGGAGCCCGCCGCGCCCCCTTCCACGC 1181

Qy  734 CGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTG 793
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1182 CGGCCGCGCCCAAGCGCAGGGGCTCCGGCTCAGTGGATGAGACCCTTTTTGCTCTTCCTG 1241

Qy  794 CTGCATCTGAGCCTGTGATACCCTCCTCTGCAG 826
        |||||||||| ||||||||||||||||||||||
Db 1242 CTGCATCTGAACCTGTGATACCCTCCTCTGCAG 1274



RESULT 14 
US-10-660-946-2 

; Sequence 2, Application US/10660946 



; Publication No. US20040063131A1 
GENERAL INFORMATION: 

APPLICANT: Bandman, Olga
; Au-Young, Janice
; Goli, Surya K.
; Hillman, Jennifer L.

TITLE OF INVENTION: TWO NOVEL HUMAN NSP-LIKE PROTEINS 
; NUMBER OF SEQUENCES: 9 

CORRESPONDENCE ADDRESS: 

ADDRESSEE : Incyte Pharmaceuticals, Inc. 
STREET: 3174 Porter Drive 
CITY: Palo Alto 
STATE: CA 
COUNTRY: U.S. 
; ZIP: 94304 

COMPUTER READABLE FORM: 

MEDIUM TYPE: Diskette 
COMPUTER: IBM Compatible 
OPERATING SYSTEM: DOS 
SOFTWARE: FastSEQ Version 1.5 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/660,946
FILING DATE: 12-Sep-2003 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/09/228,213A
FILING DATE: <Unknown> 
; APPLICATION NUMBER: 08/700,607 

FILING DATE: <Unknown> 
ATTORNEY/AGENT INFORMATION: 
NAME: Billings, Lucy J. 
; REGISTRATION NUMBER: 36,749 

REFERENCE/DOCKET NUMBER: PF-0114 US 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 415-855-0555 
; TELEFAX: 415-845-4166 

INFORMATION FOR SEQ ID NO: 2: 
; SEQUENCE CHARACTERISTICS: 

LENGTH: 799 base pairs 
; TYPE: nucleic acid 

STRANDEDNESS: single 
TOPOLOGY: linear 
MOLECULE TYPE: cDNA 
; IMMEDIATE SOURCE: 

LIBRARY: <Unknown> 
CLONE: Consensus 
SEQUENCE DESCRIPTION: SEQ ID NO: 2: 
US-10-660-946-2 

Query Match 13.3%; Score 497.4; DB 13; Length 799; 

Best Local Similarity 92.7%; Pred. No. 1e-119;

Matches 522; Conservative 0; Mismatches 41; Indels 0; Gaps 0; 

Qy 3178 GTTGTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGC 3237
        |||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
Db  108 GTTGTTGACCTCCTGTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGC 167

Qy 3238 TTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCC 3297
         ||||||||||||| ||  ||||||| ||||||||||| || ||||| ||||||||||||
Db  168 CTATTCCTGCTGCTTTCATTGACAGTATTCAGCATTGTGAGCGTAACAGCCTACATTGCC 227

Qy 3298 TTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGCTATC 3357
        |||||||||||||| ||||| ||||||||||||||||| ||||| |||||||| ||||||
Db  228 TTGGCCCTGCTCTCTGTGACCATCAGCTTTAGGATATACAAGGGTGTGATCCAAGCTATC 287

Qy 3358 CAGAAATCAGATGAAGGCCACCCATTCAGGGCATATTTAGAATCTGAAGTTGCTATATCA 3417
        |||||||||||||||||||||||||||||||||||| | ||||||||||||||||||||
Db  288 CAGAAATCAGATGAAGGCCACCCATTCAGGGCATATCTGGAATCTGAAGTTGCTATATCT 347

Qy 3418 GAGGAATTGGTTCAGAAATACAGTAATTCTGCTCTTGGTCATGTGAACAGCACAATAAAA 3477
        ||||| ||||||||||| |||||||||||||||||||||||||||||| |||| |||||
Db  348 GAGGAGTTGGTTCAGAAGTACAGTAATTCTGCTCTTGGTCATGTGAACTGCACGATAAAG 407

Qy 3478 GAACTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTG 3537
        ||||| ||||| || |||||||||||||||||||||||||| ||||||||||||||||||
Db  408 GAACTCAGGCGCCTCTTCTTAGTTGATGATTTAGTTGATTCTCTGAAGTTTGCAGTGTTG 467

Qy 3538 ATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCT 3597
        |||||||| ||||| ||||||||||||||||| ||||||||||||||||||||||| |||
Db  468 ATGTGGGTATTTACCTATGTTGGTGCCTTGTTTAATGGTCTGACACTACTGATTTTGGCT 527

Qy 3598 CTGATCTCACTCTTCAGTATTCCTGTTATTTATGAACGGCATCAGGTGCAGATAGATCAT 3657
        || || |||||||||||| |||||||||||||||||||||||||||  ||||||||||||
Db  528 CTCATTTCACTCTTCAGTGTTCCTGTTATTTATGAACGGCATCAGGCACAGATAGATCAT 587

Qy 3658 TATCTAGGACTTGCAAACAAGAGTGTTAAGGATGCCATGGCCAAAATCCAAGCAAAAATC 3717
        |||||||||||||||| ||||| |||||| ||||| ||||| ||||||||||||||||||
Db  588 TATCTAGGACTTGCAAATAAGAATGTTAAAGATGCTATGGCTAAAATCCAAGCAAAAATC 647

Qy 3718 CCTGGATTGAAGCGCAAAGCAGA 3740
        |||||||||||||||||||| ||
Db  648 CCTGGATTGAAGCGCAAAGCTGA 670





RESULT 15 
US-09-789-386-5 

; Sequence 5, Application US/09789386 

; Patent No. US20020010324A1 

; GENERAL INFORMATION: 

; APPLICANT: MICHALOVICH, DAVID 

; APPLICANT: PRINJHA, RABINDER KUMAR 

; TITLE OF INVENTION: NOVEL COMPOUNDS 

; FILE REFERENCE: GP-30165-C1 

; CURRENT APPLICATION NUMBER: US/09/789,386 

; CURRENT FILING DATE: 2001-02-21 

; PRIOR APPLICATION NUMBER: U.K. 9916898.1 

PRIOR FILING DATE: 1999-07-19 
; PRIOR APPLICATION NUMBER: U.K. 9816024.5 
; PRIOR FILING DATE: 1998-07-22 
; PRIOR APPLICATION NUMBER: US 09/359,208 
; PRIOR FILING DATE: 1999-07-22 
; NUMBER OF SEQ ID NOS : 6 

; SOFTWARE: FastSEQ for Windows Version 3.0 
; SEQ ID NO 5 

LENGTH: 1122 



TYPE: DNA 
; ORGANISM: HOMO SAPIENS 
US-09-789-386-5 

Query Match 13.3%; Score 497.4; DB 9; Length 1122; 

Best Local Similarity 92.7%; Pred. No. 1.3e-119; 

Matches 522; Conservative 0; Mismatches 41; Indels 0; Gaps 0; 

Qy 3178 GTTGTTGACCTCCTCTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGC 3237
        |||||||||||||| |||||||||||||||||||||||||||||||||||||||||||||
Db  556 GTTGTTGACCTCCTGTACTGGAGAGACATTAAGAAGACTGGAGTGGTGTTTGGTGCCAGC 615

Qy 3238 TTATTCCTGCTGCTGTCTCTGACAGTGTTCAGCATTGTCAGTGTAACGGCCTACATTGCC 3297
         ||||||||||||| ||  ||||||| ||||||||||| || ||||| ||||||||||||
Db  616 CTATTCCTGCTGCTTTCATTGACAGTATTCAGCATTGTGAGCGTAACAGCCTACATTGCC 675

Qy 3298 TTGGCCCTGCTCTCGGTGACTATCAGCTTTAGGATATATAAGGGCGTGATCCAGGCTATC 3357
        |||||||||||||| ||||| ||||||||||||||||| ||||| |||||||| ||||||
Db  676 TTGGCCCTGCTCTCTGTGACCATCAGCTTTAGGATATACAAGGGTGTGATCCAAGCTATC 735

Qy 3358 CAGAAATCAGATGAAGGCCACCCATTCAGGGCATATTTAGAATCTGAAGTTGCTATATCA 3417
        |||||||||||||||||||||||||||||||||||| | ||||||||||||||||||||
Db  736 CAGAAATCAGATGAAGGCCACCCATTCAGGGCATATCTGGAATCTGAAGTTGCTATATCT 795

Qy 3418 GAGGAATTGGTTCAGAAATACAGTAATTCTGCTCTTGGTCATGTGAACAGCACAATAAAA 3477
        ||||| ||||||||||| |||||||||||||||||||||||||||||| |||| |||||
Db  796 GAGGAGTTGGTTCAGAAGTACAGTAATTCTGCTCTTGGTCATGTGAACTGCACGATAAAG 855

Qy 3478 GAACTGAGGCGGCTTTTCTTAGTTGATGATTTAGTTGATTCCCTGAAGTTTGCAGTGTTG 3537
        ||||| ||||| || |||||||||||||||||||||||||| ||||||||||||||||||
Db  856 GAACTCAGGCGCCTCTTCTTAGTTGATGATTTAGTTGATTCTCTGAAGTTTGCAGTGTTG 915

Qy 3538 ATGTGGGTGTTTACTTATGTTGGTGCCTTGTTCAATGGTCTGACACTACTGATTTTAGCT 3597
        |||||||| ||||| ||||||||||||||||| ||||||||||||||||||||||| |||
Db  916 ATGTGGGTATTTACCTATGTTGGTGCCTTGTTTAATGGTCTGACACTACTGATTTTGGCT 975

Qy 3598 CTGATCTCACTCTTCAGTATTCCTGTTATTTATGAACGGCATCAGGTGCAGATAGATCAT 3657
        || || |||||||||||| |||||||||||||||||||||||||||  ||||||||||||
Db  976 CTCATTTCACTCTTCAGTGTTCCTGTTATTTATGAACGGCATCAGGCACAGATAGATCAT 1035

Qy 3658 TATCTAGGACTTGCAAACAAGAGTGTTAAGGATGCCATGGCCAAAATCCAAGCAAAAATC 3717
        |||||||||||||||| ||||| |||||| ||||| ||||| ||||||||||||||||||
Db 1036 TATCTAGGACTTGCAAATAAGAATGTTAAAGATGCTATGGCTAAAATCCAAGCAAAAATC 1095

Qy 3718 CCTGGATTGAAGCGCAAAGCAGA 3740
        |||||||||||||||||||| ||
Db 1096 CCTGGATTGAAGCGCAAAGCTGA 1118

Search completed: September 11, 2004, 16:03:44 
Job time : 1604.94 secs



